DIMENSIONS OF TEACHING EFFECTIVENESS:


A STUDENT PERSPECTIVE 


BETTY J. HASLETT 
University of Delaware 


ABSTRACT 


Forty-one semantic differential scales measuring the concept of a good teacher were used to assess the general, underlying
judgmental dimensions which students use in evaluating effectiveness in teaching. Both high school and college students judged teachers on
the dimensions of student/teacher rapport, communicative style, instructional style, and stimulation. A personalization factor, measuring
the teacher's ability to personalize and make relevant class materials, was also found among college students and conceptually distinguished
them from the high school students. In addition to class level differences, sex differences among students were also investigated.


THE SEARCH FOR EFFECTIVENESS in teaching has 
been a focal point of educational research for decades. Re- 
cently, the increasing pressure for academic accountability 
has focused more attention on the evaluation of instruction. 
One technique for evaluating instructional effectiveness is 
the use of student ratings of instructors and courses. In
their comprehensive review of the literature on evaluations, 
Costin, Greenough, and Menges (6) concluded that student 
ratings of instruction were valid and reliable 
various criteria of instructional effectiveness. 

In studies involving student ratings of instructor and 
course, an increasing number have utilized factor analytic
techniques to uncover the underlying judgmental dimensions
involved in evaluating teaching behavior. These general 
factors identify basic components of the teaching process 
and are therefore of importance in training and evaluating 


teachers. Meredith (15) concluded that research on the
dimensionality of student ratings in assessing instruction
supported a model isolating the factors of general course
attitude, instructional method, student interest and attention,
course content, instructor characteristics, and specific
procedures. The research of Finkbeiner, Lathrop, and
Schuerger (9) supported a multifactor model of course
attitudes: they found five factors interpreted as general
course attitude; attitude toward examinations; attitude
toward method; instructor/student rapport; and attitude
toward workload. A review of seven factor analytic studies
among college students by Cashin (3) revealed several
factors common to all the studies. These common factors
were: course organization; workload or difficulty level of
courses; student-teacher rapport and interaction; general
teaching skills; instructor impact; stimulation and interest;
and grading and [...].

However, these studies suffer from several limitations.
First, factor structures generated from these studies rely on
the actual, observed conduct of the teacher in the classroom,
thus sampling a particular student's experience in a
particular class at a specific time. Such dimensions suggest
factors operating in particular educational contexts, not
general underlying components of teaching behavior. As
Ryans (7) has observed, little reliable information is
available concerning good teaching: one of the major reasons
for this situation, he concluded, was “the lack of any [...]
understanding of the various patterns of behavior that
characterize teachers in general” (17:1).

A second limitation of factor analytic studies is that
such studies have been done exclusively with college
students. Several researchers (17, 19) found that class level
significantly influenced students' judgments of teaching
effectiveness. These findings suggest that studies assessing
the dimensionality of student ratings of courses and
instructors should be extended to include students at other
academic levels. The researcher is aware of only one factor
analytic study, conducted by Smalzreid and Remmers (18),
which used high school students. Smalzreid and Remmers
used the Purdue Rating Scale for Instructors and found two
factors: an empathy trait (which appears to be similar to
student-teacher interaction factors in college factor analytic
studies) and a professional maturity trait (which appears to
be similar to teaching method and instructor competency
factors in college factor analytic studies). This study,
however, used only a selected subsample of items from the
Purdue Rating Scale for Instructors.
The general question to be answered, then, is whether
certain types of students view good teachers in ways that 
differ from the perspective of other students. More specif- 
ically, the present research attempted to (a) characterize 
the general dimensions that underlie students’ assessment 
of teaching behavior; and (b) assess what changes, if any, 
occur in students' general evaluations of teaching
effectiveness as a function of the educational experience (either
high school or collegiate experience) and sex of the student.


It was hypothesized that the basic underlying dimensions 
of assessing effective teaching would be the same for both 
high school and college students. However, the relative 
contribution toward effective teaching that each dimension 
made would differ for the two groups. 


Method 


Subjects 

Ss were 667 high school students and 219 college 
students. There were 282 males and 385 females in the 
high school subject pool, and 98 females and 121 males in 
the college subject pool. 


Design 

The general abstract concept of a “good teacher” was
selected to be measured because it assesses one’s attitude 
toward teaching in general and one's judgment of the var- 
ious criteria used to evaluate teaching. A semantic differ- 
ential was constructed to evaluate the concept of a good 
teacher. Fifteen high school students wrote essays describ- 
ing the best teacher as well as the worst teacher they had 
had. From these essays, a set of adjectives was abstracted 
and a series of bipolar adjectival scales for the concept 
constructed. Sixty scales were used to evaluate the concept 
of a good teacher. A pilot study was conducted among 90 
college students to evaluate each potential scale item. 

In addition to evaluating their instructor using the test
instrument, these students were asked to delete any scales
they thought irrelevant or unimportant, and to add scales
reflecting qualities they believed important for effective
teaching if not already included in the test instrument.

On the basis of the deletions and additions made by
students in the pilot study and the experimenter's deletion
of redundant or unreliable scale items, 41 scales were
selected for inclusion in the final form of the test instrument.


Procedure 

The instrument was administered to the high school 
students in their English classes. Students were told that 
this research was to aid in constructing instruments for the 
evaluation of instruction. Students were also told to rate
good teaching in general and not the instructor of any
class they were taking or had taken in the past. Four
college classes from the state university were selected to
participate in the study and represented a cross section of
students from the humanities, social, and natural sciences.


Directions for using the semantic differential scales 
were printed on the top of the instrument scales and were 
read to the Ss. Ss were encouraged to answer every scale. 
The E or the research assistant worked through several 
examples to further illustrate use of the scales. 


Data Analysis 


The data analysis done in this study reported the results 
of factor analyses using a principal components solution 
with an orthogonal, varimax rotation done on the scales 
measuring the concept of a good teacher for both high 
school and college students. Unities were inserted for com- 
munalities, and the point at which substantive increases in 
the cumulative proportion of variance were no longer be- 
ing made served as a cut-off point in the iteration of factors. 
A principal components solution was used to generate or- 
thogonal, independent factor scores for each S. Multi- 
variate analyses of variance were also done to test for 
significant differences in student scores across the scale 
items as a function of student sex and educational exper- 
ience. 
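The rotation step described above can be illustrated with a minimal sketch. It assumes a small, hypothetical matrix of unrotated loadings (the values are invented for illustration and are not the study's data) and applies Kaiser's pairwise varimax rotation without row normalization. The original analysis was run in a packaged factor analysis program, so this is only one way the same kind of orthogonal rotation might be reproduced; the function names are illustrative.

```python
import math


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    k = len(loadings[0])
    total = 0.0
    for f in range(k):
        squared = [row[f] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((s - mean) ** 2 for s in squared) / p
    return total


def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Orthogonal varimax rotation by successive planar (pairwise) rotations."""
    L = [list(row) for row in loadings]
    p = len(L)
    k = len(L[0])
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for a in range(k - 1):
            for b in range(a + 1, k):
                u = [row[a] ** 2 - row[b] ** 2 for row in L]
                v = [2.0 * row[a] * row[b] for row in L]
                A, B = sum(u), sum(v)
                C = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                D = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle that maximizes the criterion for this pair of factors.
                phi = 0.25 * math.atan2(D - 2.0 * A * B / p,
                                        C - (A * A - B * B) / p)
                c, s = math.cos(phi), math.sin(phi)
                for row in L:
                    x, y = row[a], row[b]
                    row[a] = x * c + y * s
                    row[b] = -x * s + y * c
                largest_angle = max(largest_angle, abs(phi))
        if largest_angle < tol:
            break
    return L


# Hypothetical unrotated loadings: six scales on two components.
unrotated = [
    [0.70, 0.50], [0.60, 0.60], [0.65, 0.55],
    [0.60, -0.50], [0.70, -0.60], [0.55, -0.60],
]
rotated = varimax(unrotated)
```

Because the rotation is orthogonal, each scale's communality (the sum of its squared loadings) is unchanged; only the distribution of loadings across factors is simplified.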


Results 
High School Students’ Judgmental Dimensions in Evaluating 
Good Teachers 


After a number of varimax rotations were done, it was 
found that a four-factor solution accounting for 41% of 
the total variance was the most meaningful interpretation 
of the data. The first factor, student/teacher rapport, was 
characterized by the qualities of trustworthiness, fairness, 
cooperativeness, and openness, and accounted for 54% of 
the variance explained by the factors. Factor II, commun- 
icative style, accounted for 20% of the factor variance and 
was measured by qualities such as ease or difficulty in un- 
derstanding the teacher's remarks, being comfortable in the 
classroom, being interesting and available for student con- 
sultation. Instructional style, the third general evaluative 
dimension, accounted for 14% of the factor variance and 
reflected teaching skills such as general organization, knowl- 
edge of the material, experience, and intelligence. The last 
factor, stimulation, reflected how challenging, strict, and 
difficult a teacher was, and accounted for 12% of the factor 
variance. (The factor structure of effective teaching ratings 
among high school students is presented in Table 1.) 
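The percentages reported for the high school solution are shares of the variance explained by the four factors, not shares of the total variance. A short sketch of the conversion, assuming the reported figures are read that way (the variable names are illustrative):

```python
# Proportion of total variance accounted for by the four-factor solution.
total_explained = 0.41

# Each factor's share of the explained (factor) variance, as reported.
factor_shares = {
    "student/teacher rapport": 0.54,
    "communicative style": 0.20,
    "instructional style": 0.14,
    "stimulation": 0.12,
}

# Share of the TOTAL variance attributable to each factor.
share_of_total = {
    name: total_explained * share for name, share in factor_shares.items()
}

for name, value in share_of_total.items():
    print(f"{name}: {value:.1%} of total variance")
```

Under this reading, the rapport factor alone corresponds to roughly 22% of the total variance in the high school ratings.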


College Students’ Judgmental Dimensions in Evaluating 
Good Teachers 


After a number of varimax rotations, a five-factor 
solution accounting for 43% of the variance was found to be 
the best interpretation of the data. The first factor, student/ 
teacher rapport, accounted for 50% of the factor variance 
and was measured by scales such as responsiveness, fairness, 
trustworthiness, and concern for students. Factor II,
instructional style, accounting for 15% of factor variance, was
Table 1. High School Students' Factor Structure for the Concept of a Good Teacher*

Factor I: Student/Teacher Rapport
    clear                      .740
    trustworthy                .738
    fair                       .680
    presents other views       .670
    concerned                  .635
    cooperative                .573
    humor                      .569
    responsive                 .545
    easy to talk to            .492
    competent                  .486

Factor II: Communicative Style
    interesting                .626
    admits errors              .594
    open-minded                .541
    comfortable in class       .519
    available                  .518
    originality                .486
    easy to understand         .448

Factor III: Instructional Style
    experienced                .665
    organized                  .646
    intelligent                .598
    sticks to the point        .573
    knowledge of material      .515
    precise                    .463

Factor IV: Stimulation
    challenging                .738
    strictness                 .671
    demanding                  .651

* Factor structure and variable loadings as obtained from SPSS factor analytic program using
varimax rotation


Table 2. College Students' Factor Structure for the Concept of a Good Teacher*

Factor I: Student/Teacher Rapport
    fair                             .734
    clear                            .680
    trustworthy                      .672
    concerned                        .662
    responsive                       .657
    decisive                         .524
    presents other views             .479
    competent                        .478
    flexible                         .474

Factor II: Instructional Style
    intelligent                      .671
    sticks to the point              .643
    knowledge of material            .557
    organized                        .489
    experienced                      .487
    appears comfortable in class     .449
    interesting                      .446
    energetic                        .426

Factor III: Communicative Style
    admits errors                    .584
    humor                            .557
    informal                         .527
    originality                      .466
    congenial                        .456
    interesting                      .444

Factor IV: Stimulation
    demanding                        .695
    critical                         .687
    challenging                      .614

Factor V: Personalization
    uses classtime effectively       .731
    personalizes material            .702
    sensitive                        .468
    open-minded                      .442

* Factor structure and variable loadings as obtained from SPSS factor analytic program using varimax rotation


measured by scales such as organization, intelligence,
experience, knowledge of the material, and interestingness.
The third factor, communicative style, accounted for 13%
of the factor variance and was measured by scales such as
congeniality, sense of humor, willingness to admit mistakes,
and informality. Factor IV, stimulation, accounted for 12%
of the variance and measured a teacher's ability to be
challenging and stimulating. The last factor, personalization,
accounted for 10% of the variance and reflected the teacher's
ability to add a personalized, human quality to his teaching.
This factor was measured by scales such as ability to
personalize and make class material relevant, and the ability to be
sensitive and open-minded. (The factor structure for college
students' evaluations of teaching effectiveness is set forth
in Table 2.)


Sex Differences in Evaluation of Good Teaching

Since the sex of students participating in the study
was known, the data were analyzed to see what differences,
if any, existed between the evaluations of males and
females with regard to teaching effectiveness. Factor scores
for each S were generated and a multivariate analysis of
variance done to assess differences in judgments between
males and females. Among the college students, only
Factor I, student/teacher rapport, showed any significant
differences in judgments between males and females:
females found good teachers to be significantly more
responsive, trustworthy, concerned, and so forth than did
males, F(1, 213) = 8.42, p < .01. Among high school
students, only Factor III, instructional style, showed
significant differences in evaluations of effective teaching
between males and females, F(1, 665) = 4.51, p < .05.
Females judged good teachers to be significantly more
organized, experienced, intelligent, and knowledgeable
than did males.
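As a rough plausibility check on the significance levels attached to the two F ratios above, one can use the fact that an F statistic with 1 numerator degree of freedom equals a squared t statistic, and that t is close to a standard normal when the denominator degrees of freedom are large. The sketch below rests on that normal approximation (an assumption made here for convenience, not the procedure used in the study), and the function name is illustrative.

```python
import math
from statistics import NormalDist


def approx_p_from_f(f_value):
    """Approximate two-tailed p for F(1, df) with large df, via z = sqrt(F)."""
    z = math.sqrt(f_value)
    return 2.0 * (1.0 - NormalDist().cdf(z))


p_college = approx_p_from_f(8.42)      # reported as F(1, 213), p < .01
p_high_school = approx_p_from_f(4.51)  # reported as F(1, 665), p < .05

print(f"college rapport factor: p is approximately {p_college:.4f}")
print(f"high school instructional style factor: p is approximately {p_high_school:.4f}")
```

Both approximate values fall below the thresholds reported in the text; an exact F distribution would give slightly larger p values, particularly for the smaller college sample.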


Student Sex and Academic Environment: Their Effects 
across Individual Semantic Differential Scales 


In order to analyze the effect of student sex and 
academic environment more closely, a multivariate analysis 
of variance was done on the semantic differential scales to 
explore their effects upon students’ evaluations of the 
characteristics of effective teachers. Significant differences 
in students’ evaluations were found from the main effects 
of student sex and academic environment, F(41, 844) = 
2.53 and 5.63, respectively; p < .001. Fifteen of 41 scales,
or about 37% of the scales, showed significant differences
between males and females in their assessment of various
criteria measuring teaching effectiveness. Overall, females
rated good teachers more highly than males on 90% of
the scales.

With regard to the influence of academic environment,
college and high school students differed significantly from
one another in their judgments on 21 of 41 scales, or 51%
of the scales. College students judged effective teachers
more positively than did high school students on 36 of 41
scales, or about 88% of the scales. Table 3 presents the overall
means for all students categorized by sex and academic
environment. There was a significant interaction between
sex X academic environment on only one scale,
“aggressiveness,” with college males judging teachers as being most
aggressive, followed by high school males, college females,
and high school females, F = 6.07, p < .01.
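The scale counts in this section can be converted to percentages directly. The short sketch below only redoes that arithmetic from the counts given in the text; the dictionary labels are illustrative.

```python
TOTAL_SCALES = 41

# Counts of scales as reported in the text.
counts = {
    "significant sex effect": 15,
    "significant environment effect": 21,
    "college students rated higher": 36,
}

# Percentage of the 41 scales, rounded to one decimal place.
percentages = {
    label: round(100 * n / TOTAL_SCALES, 1) for label, n in counts.items()
}

for label, pct in percentages.items():
    print(f"{label}: {pct}% of scales")
```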


Discussion 

Both groups of students, high school as well as college, 
appeared to have very similar judgmental dimensions with 
regard to evaluating good teaching. Both factor analyses 
revealed a fundamental duality in judgment: teachers were 
judged as professionals (e.g., in their role as [...]) and
as individuals (e.g., as unique [...]). Many researchers
(7, 17) have observed that educators need to understand
what the basic behavioral components of teaching
are before one can begin to evaluate and measure teaching
effectiveness, as well as train effective teachers.

Both high school and college students judged teaching 
on the dimensions of student/teacher rapport, communi- 
cative style, instructional style, and difficulty or stimula- 
tion level. Although the variance accounted for by these 
evaluative dimensions varied for each group, and there were 
some differences in the variables loading on these factors 
for each group, the similarity of the four factors for both 
high school and college students suggests that these four 
evaluative dimensions are basic, fundamental skills needed 
for good teaching, regardless of subject matter taught,
level of students taught, teacher personality, or a number
of other variables. 

Previous factor analytic studies of student ratings (20) 
have found evaluative dimensions similar to the judgmental 
dimensions found in the present investigation. Various 
dimensions of effective teaching may interact with one 
another to produce particular teaching styles that are most 
effective for certain educational contexts, students, or sub- 
ject areas: not all dimensions will be equally stressed for 
differing contexts, students, or subject areas. Yet the basic 
similarity of the judgments found in this study and in pre- 
vious investigations suggests that these four facets of teach- 
ing behavior would undoubtedly be reflected in any par- 
ticular teaching strategy. 

Past research on the qualities of good or ideal college 
teachers (1, 4, 10, 11) found that qualities such as thorough 
knowledge of the subject matter, clarity, open-mindedness, 
and the ability to be interesting were characteristics assoc- 
iated with effective teaching. The present study found these 
qualities to be valued by both high school and college 
students: other studies have shown somewhat similar rank- 
ings and agreement among students, faculty, and admin- 
istrators (2, 8, 16). Table 4 gives a comparison of the rank 
order of the ten most valued characteristics for effective 
teachers by high school and college students in the present 
study (selected by rank-ordering the scales from highest to 
lowest mean—the higher the mean, the more valued the 
scale), as well as the ten most valued qualities found in the 
Bousfield, Clinton, and Perry studies (1, 4, 16). 
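The selection rule described for Table 4 (rank the scales by mean rating, highest first, and keep the top entries) can be sketched as follows. The means below are hypothetical placeholders chosen for illustration, not values from the study, and the function name is likewise illustrative.

```python
def top_scales(scale_means, n=10):
    """Return the n scale names with the highest mean ratings, in rank order."""
    ranked = sorted(scale_means.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Hypothetical mean ratings for five scales (not the study's data).
hypothetical_means = {
    "clear": 6.5,
    "fair": 6.1,
    "strict": 3.6,
    "organized": 6.3,
    "informal": 5.2,
}

top_three = top_scales(hypothetical_means, n=3)
print(top_three)
```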

The personalization factor, not found in previous 
factor analytic studies among college students, may reflect 
a change in American society and higher education over 
the last decade: a sense of growing alienation and deper- 
sonalization as documented in Toffler’s Future Shock and 
Reich’s The Greening of America. The educational exper- 
ience in high school is more personal and individualized 
since classes are usually small and students and teachers 
typically know each other to some extent, whereas the 
college experience is more impersonal and isolated due to 
large classes, the greater number of students enrolled, and 
the impersonal contact between teacher and student, and 
oftentimes, between student and student.


Given the substantial changes in ideas and attitudes 
students undergo during their college years (20), students 
undoubtedly feel the need for personal attention, relevancy, 
and meaning in their interaction with others in the academ- 
ic community during this period of personal growth and 
change. Since for many students teachers provide the 
only source of extended personal contact within the 
academic structure (excluding, of course, personal friend- 
ships that develop among students), it is not surprising 
that college students believe that teachers in particular 
ought to personalize their teaching and make their subject 
areas meaningful and relevant for students. Such an attitude 
may in part be documented by the growth of programs 


Table 3. Effect of Academic Environment and Student Sex on Student Attitudes toward Good Teachers*

                                        High School       College           p value
Scales                                  Males    Females  Males    Females  Sex      Environment
Aggressive/Unaggressive                 5.19     ...      ...      ...      NS       .010
Easy to talk to/Not easy to talk to     6.29     6.28     6.31     6.57     NS       NS
Flexible/Inflexible                     5.39     5.30     5.68     5.86     NS       .001
Personalizes material/
  Material presented abstractly         4.52     4.67     5.10     5.24     NS       .001
Uses classtime effectively/
  Does not use classtime effectively    5.49     5.73     6.14     6.28     .010     .001
Open-minded/Narrow-minded               6.26     6.36     6.32     6.61     NS       NS
Sensitive/Insensitive                   4.95     4.92     5.51     5.85     NS       .001
Informal/Formal                         4.95     5.18     5.28     5.29     NS       NS
Sense of humor/Humorless                6.08     5.96     5.96     6.13     NS       NS
Shows originality/
  Does not show originality             5.93     5.97     6.04     6.21     NS       NS
Challenging/Easy                        4.85     5.00     5.20     5.52     NS       .001
Strict/Lenient                          3.45     3.60     3.72     3.82     NS       .010
Unprejudiced/Prejudiced                 6.09     6.06     6.23     6.31     NS       .030
Intelligent/Unintelligent               4.43     4.33     4.85     5.01     NS       .001
Precise/Imprecise                       6.25     6.47     6.40     6.52     .010     NS
Sticks to the point/Digresses           4.93     5.21     5.39     5.40     .020     .010
Critical/Uncritical                     6.10     6.22     6.15     6.47     NS       NS
Fair/Unfair                             5.77     6.14     5.13     6.13     .001     NS
Energetic/Unenergetic                   6.51     6.62     6.52     6.67     .050     NS
Trustworthy/Untrustworthy               5.90     6.07     6.33     6.52     .030     .001
Experienced/Inexperienced               5.22     5.13     5.89     5.84     NS       .001
Congenial/Uncongenial                   6.09     6.27     6.24     6.47     .010     .040
Knows material/
  Does not know material                5.99     6.19     6.26     6.52     .010     .001
Concerned/Indifferent                   6.42     6.40     6.47     6.71     NS       .020
Competent/Incompetent                   6.08     6.20     6.00     6.27     NS       NS
Easy to understand/
  Difficult to understand               5.62     6.01     5.97     6.13     .001     NS
Cooperative/Uncooperative               6.07     6.02     6.23     6.35     NS       .004
Demanding/Undemanding                   5.41     5.21     5.81     5.92     NS       .001
Appears comfortable in class/
  Does not appear comfortable in class  4.58     4.70     4.58     4.57     NS       NS
Extroverted/Introverted                 4.37     4.19     4.95     4.63     .010     .001
Considers student opinions/
  Does not consider student opinions    6.34     6.55     6.40     6.66     .001     NS
Does not show favoritism/
  Shows favoritism                      5.74     5.89     6.05     6.28     .040     .001
Interesting/Uninteresting               6.36     6.48     6.36     6.57     NS       NS
Decisive/Indecisive                     5.58     5.82     5.97     6.16     .007     .002
Responsive/Unresponsive                 5.45     5.56     5.16     5.56     NS       NS
Presents different viewpoints/
  Presents only one viewpoint           6.28     ...      ...      ...      NS       .001
Available/Unavailable                   5.96     ...      ...      6.41     .003     .001
Clear/Unclear                           5.85     5.93     6.41     6.58     NS       .001
Organized/Unorganized                   6.27     6.38     6.21     6.33     NS       NS
Admits mistakes/
  Does not admit mistakes               ...      ...      6.35     6.52     NS       ...
Important to me/Unimportant to me       5.47     5.85     5.41     5.81     .001     NS

* p values were obtained by conducting F tests with 1 degree of freedom in the numerator and 842 degrees
in the denominator. Scales are consistently arranged with the left-hand member of the bipolar adjectives
representing the more positive judgment of that value (and having high numbers represent that end of the
continuum) and the right-hand member representing the more negative evaluation of a particular
characteristic (represented by lower numbers). For some scales, such as “Formal/Informal,” where a judgment
of which attribute is preferred is not clear, this scale arrangement has been decided arbitrarily by the
experimenter.


Table 4. Characteristics of Good Teachers*

Clinton (1930)
 1. Knowledge of subject
 2. Pleasing personality
 3. Neatness in work and appearance
 4. Fairness
 5. Kind, sympathetic
 6. Sense of humor
 7. Interest in profession
 8. Interesting style of presentation
 9. Alertness and broad-mindedness
10. Knowledge of methods

Bousfield (1940)
 1. Fairness
 2. Mastery of subject
 3. Interesting style of presentation
 4. Well organized
 5. Clarity of presentation
 6. Interest in students
 7. Helpfulness
 8. Ability to direct discussion
 9. Sincerity
10. Keen intellect

Perry (1971)
 1. Well prepared for class
 2. Sincere interest in subject
 3. Knowledge of subject
 4. Effective teaching methods
 5. Tests for understanding
 6. Fairness
 7. Effective communication
 8. Encourages independent thought
 9. Logical organization of course
10. Motivates students

High School Ss
 1. Clarity
 2. Trustworthiness
 3. Challenging
 4. Fairness
 5. Strictness
 6. Presents others' views
 7. Experienced
 8. Organized
 9. Concern for students
10. Interesting

College Ss
 1. Knowledge of subject
 2. Interesting
 3. Clarity of presentation
 4. Fairness
 5. Competency
 6. Trustworthiness
 7. Open-mindedness
 8. Admits mistakes
 9. Responsiveness
10. Available to students

* In rank order of their importance in each study


like Women's Studies, Black Studies, and Criminal Justice,
which reflect student and societal demands for relevancy 
and social impact. 

Since the current investigation involved a relatively 
small college population and sampled a limited geographic 
locale, these results can only be suggestive of a new dimen- 
sion of teaching effectiveness. Additional research will be 
necessary to explore further the personalization factor and 
examine its significance for theories of instruction, learning, 
and personal growth. 

The sex differences found in the factor structures for 
high school and college students centered around two 
factors: female college students rated an effective teacher 
more highly on the rapport dimension than did males,
and female high school students rated an effective teacher's
instructional style more highly than did male high school
students. The difference among the high school students 
was interpreted as a reflection of the more negative, 
critical contacts that male high school students have with 
their teachers (12): males, as a result of this negative
interaction, will have a less positive opinion of teachers in general.


The female college students’ higher rating of an effective 
teacher’s rapport was interpreted as a result of their greater 
response to warmth and openness on the part of teachers 
(14). When examining individual semantic differential scales 
for sex differences across scales, it was found that females 
rated good teachers significantly higher on scales measuring 
the teacher's warmth and openness. This was thought to 
reflect not only the female college student's greater
responsiveness to a teacher's warmth but also the more positive
contacts that high school females have with their teachers
(12). Generally, females rated good teachers more highly
than did males. 

College students rated effective teachers more positively 
than did high school students. Examining these differences 
across individual semantic differential scales, it was found 
that college students rated effective teachers significantly 
higher on three general parameters: a teacher's competency 
(measured by scales such as knowledge of material, in- 
telligence, and experience); a teacher's level of difficulty 
(reflected by scales such as being challenging and demand- 
ing); and a teacher's responsiveness (measured by scales 
such as extroversion, concern, availability, and sensitivity). 
These differences were interpreted as reflecting the greater 
degree of educational difficulty in college—students, as well 
as teachers, are expected to be more knowledgeable and 
competent in general, as well as responsive to more vigorous 
intellectual demands. The personalization factor found 
among collegians was also thought to influence their rat- 
ings of a teacher's responsiveness. In general, college 
students rated good teachers more positively than did high 
school students; this may generally reflect the higher value 
college students place upon education since such students 
have elected to further their education. 

As Trent and Cohen (20) concluded in their review of 
research on teaching in higher education, there is a need 
for research on societal expectations that define a teacher's 
role. Such knowledge is necessary for effective teaching, 
adequate teacher training, and accurate teaching evaluation. 
The Levinthal, Lansky, and Andrews study (13) found that 
there is an interaction between a student's concept of an
ideal teacher and that student's ratings of actual instructors.
Before one can properly evaluate a student's ratings of
instruction it is necessary to understand the educational
values and role expectations of teachers which guide the
student's ratings (17). The current research was a step in
the direction of assessing some of these values and role
expectations among students of differing sex and
educational experiences.
AN EMPIRICAL ANALYSIS OF THE INSTRUCTIONAL


EFFECTIVENESS IN VISUALIZED INSTRUCTION 


THOMAS C. ARNOLD 
The Pennsylvania State University


ABSTRACT 


The purpose of this study was to investigate the relative effectiveness of specific media attributes on student performance on criterion 
tests measuring different levels of understanding. Specifically, it attempted to identify which of two levels of stimulus explicitness in 
visuals was most effective in facilitating student achievement on criterion tests of knowledge, comprehension, and total understanding 
for students possessing two different levels of entering behavior. One hundred seventy-one subjects participated in the study. The two- 
way ANOVA procedure was utilized to investigate the existence of interaction between entering behavior and level of stimulus explicitness. 
Results indicated that a significant relationship existed between entering behavior and performance on post-criterion tests; no relationship 
existed between stimulus explicitness and achievement on the criterion tests; and insignificant interactions were found to exist between
entering behavior and instructional treatment.


A NUMBER OF EDUCATIONAL RESEARCHERS 
(2, 4, 6, 7) commenting on teaching effectiveness have 
indicated that current media research has not incorporated 
instructional techniques based on sound instructional 
and/or psychological research. A great many of the studies 
seem to be primarily concerned with conducting evaluative 
comparisons to support the use of one form of media in 
preference to another, while providing little insight con- 
cerning the effectiveness of attributes inherent to a 
particular medium. Recently educators have encouraged 
the development of research designs which would not only 
evaluate the relative effectiveness of different media but 
would also identify instructional strategies by which given 
types of learners would achieve optimally. One of the
theoretical orientations which has emerged as a result
of this trend was designed by Salomon (6) and is used in
this study. His theory of stimulus explicitness attempts
to present an understanding of how the use of media
influences learning. Since this theory was influential on the
design of this study, a brief synopsis is warranted.


Theoretical Orientation 


It is Salomon's belief that for stimuli to be effective in
learning they must affect mental processes in the learner
relevant to the task being learned. The stimulus explicitness
theory assumes that one of the most fundamental functions
of visual stimuli is to inform, that is, to reduce uncertainty
and thus increase the learner's probability of achieving a
correct response relevant to the learning task. He further
suggests that the instructional effectiveness of a given type
of visual stimulus in reducing uncertainty is contingent
upon the prior existence of aroused uncertainty in the
individual.

Different types of visual materials contain varying 
amounts of realistic detail which, in turn, can be con- 
sidered to represent varying degrees of stimulus explicitness. 
For example, if in a learning situation the individual does 
not experience any uncertainty, his behavior might 
depict what is normally called the boredom syndrome— 
daydreaming, doodling, etc. However, if the stimulus 
materials used in the instructional situation generate some 
uncertainty, the learner may be motivated to search for 
additional information in order to reduce this uncertainty. 
If too much uncertainty is introduced into the learning 
situation, it may cause the learner to react negatively 
towards the stimulus materials and reject the purposes for
which they were originally designed. 

This assumption finds support in the theory and from 
information theorists (3, 5, 6) who report that as the 
amount of information in the stimulus increases, the uncer- 
tainty generated by the stimulus decreases. Figure 1 illus- 
trates this relationship between uncertainty and stimulus 
explicitness. 

In this continuum the amount of uncertainty in a 
stimulus is a function of the amount of information 
conveyed by the stimulus. As the amount of information 
in a visual increases, the uncertainty generated by the 
visual decreases. In terms of information theory this 
could be interpreted to mean that visuals possessing 
higher degrees of explicitness should have a greater
potential for reducing entering uncertainty, thus increasing
the probability that a greater amount of learning will occur.
Figure 2 illustrates the relationship between Salomon's
theory and information theory. This diagram graphically
depicts the relationship between learning probability and
the amount of uncertainty generated by visuals containing
different amounts of stimulus explicitness.

This figure illustrates that the probability of learning 
increases as the uncertainty in the stimulus decreases due 
to an increase in explicitness of the stimulus. However, 
since Salomon predicts that this curve will vary depending
on the learner's prior cognitive experience, he suggests
that the direction of the learning curve will be dependent 
on the entering uncertainty of the individual. This projected 
relationship between entering behavior, learning probability, 
stimulus explicitness, and stimulus uncertainty is shown 
in Figure 3. 

The purpose of this study was to evaluate the predictability
of Salomon's stimulus explicitness theory by investigating
the instructional effectiveness of two types of
visual stimuli each possessing different degrees of stimulus
explicitness. Specifically, the purposes of this study were
to: (a) explore the research potential of the stimulus
explicitness theory as a model for guiding research on
visualized instruction; and (b) determine the instructional
effectiveness of visual materials possessing different degrees
of stimulus explicitness and also their effect on students
possessing different entering behaviors.


Method 
Materials 


The materials for this investigation consisted of two sets
of instructional programs designed in textbook format.
The printed subject matter transmitted via these instructional
packages was held constant, with each package
consisting of 37 pages. Each page contained a 2V; inch
by 3% inch illustration of the human heart that was
designed to complement the printed content material on
that page.


Treatment Groups


One hundred seventy-one college students enrolled in
the Instructional Media 411 course at the Pennsylvania
State University participated in this study. This course
provides the orientation and competencies recommended
by the State of Pennsylvania as necessary requirements
for teaching certification. The findings of this study may
well be generalized to students majoring in education.

Students were assigned to one of three entering behavior
groups as a result of their performance on a pre-test
employed for this purpose. Members of each entering behavior
group were then randomly assigned to one of two
treatment groups. These treatment groups received identical
written presentations; however, each of the two groups
received their own respective type of visual illustrations
containing one of two degrees of stimulus explicitness
(uncertainty). Detailed, shaded drawings (high stimulus
explicitness/low uncertainty) of the human heart were employed
in the instructional packages to complement the printed
instruction.


Figure 1.—An uncertainty continuum
[Axis: amount of information in a stimulus, low to high;
uncertainty decreases as the amount of information increases.]


Figure 2.—Relationship between learning probability and the
explicitness or uncertainty in a stimulus
[Axes: stimulus uncertainty, high to low; stimulus explicitness,
low to high; learning probability increases.]


Figure 3.—Relationship between entering behavior, learning
probability, and the explicitness or uncertainty in a stimulus
[Axes: stimulus uncertainty, high to low; stimulus explicitness,
low to high; learning probability, low to high; entering
behavior, low to high.]

Figure 4 shows the experimental design for this study.
In order to ensure significant differences between entering
behavior groups and to provide reliability in assignment to
the different groups, only those students identified as
achieving high and low entering behavior on the content
pre-test were used in this study. The statistical procedure
employed for this purpose involved the establishment of
confidence limits about a student's obtained score, and
resulted in the probability of .95 that students used in
this study were, in fact, correctly assigned to the proper
treatment groups.
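
The screening rule just described can be illustrated with a minimal sketch. The report does not state which formula was used for the confidence limits, so the sketch assumes the conventional standard error of measurement, SEM = SD x sqrt(1 - r), and a normal-theory band of 1.96 SEM on either side of the obtained score. The function names, the cutoff score, and the standard deviation are hypothetical; only the pre-test reliability of .64 is taken from the report.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Conventional SEM (assumed formula, not confirmed by the report)."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_limits(score, sd, reliability, z=1.96):
    """Approximate .95 limits about an obtained score."""
    half_width = z * standard_error_of_measurement(sd, reliability)
    return score - half_width, score + half_width


def classify(score, cutoff, sd, reliability):
    """Keep a student only if the whole band lies on one side of the cutoff."""
    lower, upper = confidence_limits(score, sd, reliability)
    if lower > cutoff:
        return "high"
    if upper < cutoff:
        return "low"
    return "unclassified"  # band straddles the cutoff; student is not used


# Hypothetical pre-test: SD = 5 points, cutoff = 12 points, r = .64.
for obtained in (20, 12, 5):
    print(obtained, classify(obtained, cutoff=12, sd=5.0, reliability=0.64))
```

Under this rule, students whose band straddles the cutoff are dropped, which is one plausible reading of why only the high and low groups were retained.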
Figure 4.—The experimental design
[Instructional treatment by entering behavior.]

A. high entering behavior—simple pictures
B. high entering behavior—detailed pictures
C. low entering behavior—simple pictures
D. low entering behavior—detailed pictures


Figure 5.—The experimental design with derivative null hypothesis

High entering behavior:  A. Low stimulus explicitness (N = 28)
                         B. High stimulus explicitness
Low entering behavior:   C. Low stimulus explicitness (N = 30)
                         D. High stimulus explicitness (N = 27)


Criterion Measures 
Each student received a pre-test before receiving his
respective treatment. In addition, each student also
received a 44-item criterial test which consisted of two
subtests, each designed to measure a specific level of
cognitive ability. Achievement scores on the two criterial
measures were the dependent variables, and the degree
of stimulus explicitness in the visualized treatments
was the independent variable.

On the pre-test (K-R 20, r = .64) students were given
a diagram of the heart and a series of questions asking
them to identify specific parts of the heart. After completion
of the instructional booklet, Ss received the
44-item post criterial test which consisted of the subtest
of knowledge (K-R 20, r = .80) and the subtest of
comprehension (K-R 20, r = .65). Scores from the two
subtests were combined to provide a measure of total
understanding (K-R 20, r = .85).
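
For readers unfamiliar with the K-R 20 coefficients quoted above, the following is a minimal sketch of the Kuder-Richardson formula 20 for dichotomously scored items. The function name and the small response matrix are hypothetical, and the sketch assumes the population variance of total scores; some texts use the sample variance instead, which gives slightly different values.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20.

    responses: list of examinee rows, each a list of 0/1 item scores.
    Uses the population variance of total scores (an assumption).
    """
    n_items = len(responses[0])
    n_people = len(responses)
    # Sum of p*q over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in responses) / n_people
        sum_pq += p * (1.0 - p)
    total_variance = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1.0 - sum_pq / total_variance)


# Hypothetical data: four examinees, three items.
example = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(example), 2))
```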


Data Analysis


A two-factor analysis of variance was the statistical model
used to analyze the data obtained. Differences were considered
significant at the .05 level. The two-way analysis of
variance model was also used to investigate the interaction
effect between entering behavior and instructional treatment.
The data obtained in this study are listed as collected
for specific sub-problems. Because the use of the 2 X 2
factorial design often provides for the testing of numerous
statistical hypotheses, Figure 5 has been provided to aid
the reader in visualizing the derivation of the null hypotheses.
This figure represents the experimental design that was
followed for each of the three sub-problems. Table 1
presents the means and standard deviations for students
in each treatment group on each criterion measure.
Unweighted means were used in the ANOVA procedures
since the study was exploratory in nature and there was
no reason to expect that one treatment would be more
effective than the other. The ANOVA used was adapted to
handle unequal Ns; furthermore, the number of students
in each cell was not so disproportionate that the
variance would be seriously affected. Bartlett's test for
homogeneity was performed on the pre-test scores, and in
no case did the observed values reach the critical value for
a .05 level test. Thus, it appeared that the students receiving
the different treatments could, in fact, be considered to have
been drawn randomly from populations with common
variance.


Sub-Problem #1

Which of two levels of stimulus explicitness is most
effective for learners identified as having either high or
low entering behavior as measured by a test of knowledge of
terminology on achievement in learning as measured by
the total test of understanding? (See Table 2.)


Table 1. 


High stimulus 
explicitness 
(N = 29)


Criterion measures 


Total test of 
understanding 

Test of 
knowledge 

Test of 
comprehension 


15.62 4.42 16.50 


Low stimulus 
explicitness 
(N =28) 


Table 2.—Results of the Two-Factor ANOVAs of the Data from
Entering Behavior and Treatment as Measured by the Total Test
of Understanding


Source of variation               df        MS        F       p

Entering behavior                         763.87       *
Treatment                          1       31.35     .67    0.42
Entering behavior X treatment      1        1.09     .02    0.88
Within groups                    110       46.59


*Significant at 0.05


Sub-Problem #2


Which of two levels of stimulus explicitness is most 
effective for learners identified as having either high or 
low entering behavior on achievement in learning as 
measured by the specific criterion test—the test 
of knowledge? (See Table 3.) 


Table 3.—Results of the Two-Factor ANOVAs of the Data from
Entering Behavior and Treatment as Measured by the
Test of Knowledge


Source of variation               df        MS        F       p

Entering behavior                  1      296.14    17.2*   0.00
Treatment                          1       21.53    1.25    0.27
Entering behavior X treatment      1       0.003    0.00    0.99
Within groups                              17.20


*Significant at 0.05


Sub-Problem #3 


Which of two levels of stimulus explicitness is most 
effective for learners identified as having either high or 
low entering behavior on achievement in learning as 
measured by the specific criterion test—the test of 
comprehension? (See Table 4.) 


—Means and Standard Deviations for Students in Each Treatment Group on Each Criterion Measure. 


High stimulus 


Low stimulus 
explicitness explicitness 
(N = 27) (N = 30) 


6.14 23.81 7.24 24.67 7.01
3.61 12.41 4.21 13.27 4.29
11.41 3.49 ...
Table 4.—Results of the Two-Factor ANOVAs of the Data from
Entering Behavior and Treatment as Measured by the Test of
Comprehension


Source of variation               df        MS        F       p

Entering behavior                  1      108.77    9.96*   0.00
Treatment                          1        0.90    0.08    0.77
Entering behavior X treatment      1        0.98    0.09    0.77
Within groups                    110       10.92


*Significant at 0.05


Results 


Three conclusions were derived from the data obtained 
in this study: 

1. There was a significant relationship between entering
behavior and performance on criterion tests. Those students
whose prior experience and knowledge of the content
material were high performed more effectively than those
with low entering behavior regardless of the type of visual
illustration they received.

2. No significant relationship was found to exist between
the level of stimulus explicitness and achievement on the
criterion tests. This was interpreted as meaning that visuals
possessing either of the stimulus explicitness levels were
equally effective in improving achievement of identical
objectives for each of the entering behavior groups.

3. No significant interactions were found to exist between 
entering behavior and type of visualization. 
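
The F ratios behind these conclusions can be checked from the mean squares printed in Tables 3 and 4, since each F in a fixed-effects two-factor design is the effect mean square divided by the within-groups mean square. The sketch below is only an arithmetic check; the function name is hypothetical and the numbers are copied from the legible table entries.

```python
def f_ratio(ms_effect, ms_within):
    """F for a fixed effect: effect mean square over within-groups mean square."""
    return ms_effect / ms_within


# Mean squares as printed in Table 3 (test of knowledge).
knowledge_within = 17.20
knowledge = {
    "entering behavior": 296.14,
    "treatment": 21.53,
}

# Mean squares as printed in Table 4 (test of comprehension).
comprehension_within = 10.92
comprehension = {
    "entering behavior": 108.77,
    "treatment": 0.90,
    "interaction": 0.98,
}

for source, ms in knowledge.items():
    print("knowledge", source, round(f_ratio(ms, knowledge_within), 2))
for source, ms in comprehension.items():
    print("comprehension", source, round(f_ratio(ms, comprehension_within), 2))
```

The recomputed ratios agree with the tabled F values to the precision printed, which supports reading the tables as df, mean square, F, and probability columns.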


Summary and Discussion 


The purpose of this study was to investigate the
relative effectiveness of specific media attributes on
student performance on criterial tests measuring different
levels of understanding. More specifically, this study was
designed to gather data to ascertain which of two levels of
stimulus explicitness in a series of visuals provided for the
most effective instruction as measured by achievement on
criterial tests of knowledge, comprehension, and total
understanding for students possessing two different levels of
entering behavior.

The theoretical rationale for this study was Salomon's
theory of stimulation. Thus, while gathering data for
purposes of investigating the above problem, it was also
the objective of this study to investigate that portion
of Salomon's theory that pertained to the role of visuals in
the instructional process. The assumption of the theory
having application to this study implied that research
involving the stimulus-explicitness dimension of visuals
should provide educators with the means to determine how
effective a specific visual is. That is, research associated with
this dimension should provide data to determine how
much cognitive activity occurs as a result of exposure to a
specific kind of stimulus presentation by a particular
learner.


These two objectives are related in the following manner.
The stimulus-explicitness dimension in a visual is a function
of the amount of information conveyed by the visual and
the ability of the information to reduce the learner's
uncertainty, thereby increasing his probability of learning
whatever message that visual was designed to convey.
The theory postulates that data gathered from research
associated with this dimension should indicate the degree
that cognitive processes are activated after exposure to this
stimulus attribute. Stated in reference to this study, if the
stimulus-explicitness property of a visual affects the
probability of reducing aroused uncertainty, then different
levels of stimulus explicitness should activate different
cognitive processes. If this were the case, then at Bloom's
cognitive levels of terminology and comprehension, one
would expect that different visuals would differ in their
effectiveness to improve learning measured at these cognitive
levels. In other words, according to the assumptions
of the theory one would expect high entering behavior
students receiving less stimulus explicitness in visuals to
perform as effectively at the same cognitive levels as
low entering behavior students who received visuals
possessing a higher level of stimulus explicitness. This
assumption is predicated on the concept that high entering
behavior students initially experience lower levels of
uncertainty and consequently require less explicitness
to achieve an equal probability for successful performance
on achievement tests measuring learning at different
levels of understanding. Conversely, if given the higher
levels of explicitness in visuals, high entering behavior
students should experience a reduction in their probability
to attain high achievement because they are not receiving
the optimum form of instruction.


Applying these expectations to low entering behavior
students, one could expect those receiving the optimum
form of instruction to perform better on the achievement
tests for each cognitive level than those receiving
the less optimal instruction. That is, low entering behavior
students receiving visuals possessing low levels of stimulus
explicitness would experience a reduction in their
probability to attain as high an achievement score as
if they had received the more optimal instruction possessing
visuals with the higher level of stimulus explicitness.


The data collected for this study did not support these 
assumptions in Salomon’s theory of stimulation. More 
specifically, an analysis of the findings failed to produce 
any significant interactions between the stimulus- 
explicitness level in visuals and the entering behavior of 
students on the criterion tasks. 
Conclusions


Three major conclusions can be made with some degree 
of confidence concerning the experimental problem. 

1. For students identified as having either high or 
low entering behavior there were significant differences 
between their mean scores on each of the post-criterion 
tests. Though one could argue that these differences 
are valid because the groups possessed significant differences 
in entering behavior relevant to the instructional material, 
other interpretations seem to be warranted. Analyzing
these significant differences in terms of Salomon's theory
of stimulation, it would appear that the two treatments
were not effective in increasing a student's probability
for learning in the direction predicted by the theory. That
is to say, students receiving the different instructional
treatments did not demonstrate different performance
levels. These data seem to contest Salomon's theory of
stimulation and simultaneously support Dwyer's
research (1) which contends that reality can be edited
for instructional purposes.

2. In regard to the two treatment groups, there were
no significant differences between the mean scores of the
first treatment group (those receiving visuals having low
stimulus explicitness) and those of the second treatment
group (those receiving visuals having high stimulus
explicitness) as measured from each of the post-criterion
tests. This could be interpreted to mean that
visuals possessing either of the two levels of stimulus
explicitness were equally effective in enhancing achievement
on identical objectives for each of the entering behavior
groups.

3. In regard to the presence of interaction effects
between entering behavior and instructional treatment,
the data showed that there were no systematic effects
on performance on the criterion tests due to a combination
of a particular entering behavior with a particular method
of instruction. This conclusion also contests that segment
of Salomon's theory postulating that this form of interaction
should occur.
COVARIANCE AND DISCRIMINANT ANALYSIS


CARL J. HUBERTY
University of Georgia


[...] group analyses are discussed [...] discriminations [...]


[...] related measures available, to one of two well-defined
populations. The formal equivalence of this technique
to that of multiple regression analysis with a dichotomous
criterion variable is well known. This equivalence has been
shown to extend to the related problems of variable
deletion and interpretation of discriminant functions.
Collier (1) showed that tests used in deleting variables in 
regression analysis and in classification are equivalent, 
while Huberty (7) showed that predictor variables may be 
equivalently ordered (with respect to contribution to 
discrimination) by univariate F-ratios and by estimates of 
predictor versus discriminant function correlations. The 
related problems of deleting variables and interpreting 
discriminant functions may be considered a part of the 
important problem of selecting the best subset of pre- 
dictors for optimal classification. These partial solutions 
do provide a basis for variable selection procedures; 
attempts have been made to generalize the predictor versus 
discriminant function correlation idea to the k-group case. 
The primary purpose of the present note is to show that 
a test of the equality of the distance between two pop- 
ulations when based on p predictors and the distance when 
based on q predictors (q < p) is equivalent to a test of the 
equality of the two population mean vectors (or centroids) 
using multivariate analysis of covariance (MANCOVA) with 
p − q variates and q covariates. The first-mentioned test is
that of Rao (10:482), and may be expressed as a test of

$$H_0\colon \Delta_p^2 = \Delta_q^2$$

where $\Delta_p^2$ is the generalized distance function between (the
centroids of) two populations based on p variables.
Mahalanobis's (8) distance (squared) between the two populations
as estimated from the sample on the basis of the
p variables is

$$D_p^2 = d' \left[ \frac{W}{N_1 + N_2 - 2} \right]^{-1} d$$

where $N_j$ = size of sample (or group) j,
d = (p × 1) vector of differences between means on the
variables for the two samples, and
W = (p × p) within-groups sums of squares and cross
products (SSCP) matrix.
The test statistic used is:

$$F = \frac{N_1 N_2 \,(D_p^2 - D_q^2)}{(N_1 + N_2)(N_1 + N_2 - 2) + N_1 N_2 D_q^2} \cdot \frac{N_1 + N_2 - p - 1}{p - q} \qquad [2]$$

which may be referred to the F distribution with p − q
and $N_1 + N_2 - p - 1$ degrees of freedom under $H_0$ with
the usual assumptions.
The statistic used in testing

$$H_0\colon \mu_1 = \mu_2$$

using MANCOVA, as given by Rulon and Brooks (11:94),
is:

$$F = \frac{N_1 N_2 (N_1 + N_2 - 2)\, D_{p \cdot q}^2 \,/\, (N_1 + N_2 - q - 2)}{(N_1 + N_2)(N_1 + N_2 - 2) + N_1 N_2 D_q^2} \cdot \frac{N_1 + N_2 - p - 1}{p - q} \qquad [3]$$

where $D_{p \cdot q}^2$ is the adjusted distance (as will subsequently
be explicitly defined). The referent distribution for this
statistic is the same as that for [2]. (It is noted here that
[3] is a transformation of Hotelling's T² statistic with
covariance adjustment.) The purpose of this note, then, is to
show that [2] and [3] are equal.

To begin with some notation, one may consider the
(p × p) supermatrix of SSCP:

$$W = \begin{bmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{bmatrix}$$

where
$W_{11}$ = ([p − q] × [p − q]) within-groups SSCP
matrix of the p − q variates,
$W_{22}$ = (q × q) within-groups SSCP matrix of the
q covariates,
$W_{12}$ = ([p − q] × q) within-groups cross products
of the p − q variates and q covariates, and
$W_{21} = W_{12}'$.

Furthermore, let

$$M = W_{11} - W_{12} W_{22}^{-1} W_{21} \qquad [4]$$

which is a (p − q) × (p − q) matrix.

The distance (squared) $D_{p \cdot q}^2$ in [3] as defined by Rulon
and Brooks (11:94), using the above notation, is

$$D_{p \cdot q}^2 = \hat{d}' \left[ \frac{M}{N_1 + N_2 - q - 2} \right]^{-1} \hat{d}$$

where $\hat{d}$ = ([p − q] × 1) vector of differences between
adjusted variable means. Now, for the p − q variates and
q covariates, the ([p − q] × q) matrix of regression
weights is

$$B = W_{12} W_{22}^{-1}$$

Thus, $\hat{d} = d_1 - W_{12} W_{22}^{-1} d_2$

where
$d_1$ = ([p − q] × 1) vector of mean differences between
the two groups on the p − q variates, and
$d_2$ = (q × 1) vector of mean differences between the
two groups on the q covariates.

Since in general $(A/k)^{-1} = kA^{-1}$ where A is a nonsingular
matrix and k is a scalar, the numerator of the fraction
on the left of [3] may be expressed as

$$N_1 N_2 (N_1 + N_2 - 2)\,(d_1 - W_{12} W_{22}^{-1} d_2)'\, M^{-1} (d_1 - W_{12} W_{22}^{-1} d_2)$$

or

$$N_1 N_2 (N_1 + N_2 - 2)\,(d_1' - d_2' W_{22}^{-1} W_{21})\, M^{-1} (d_1 - W_{12} W_{22}^{-1} d_2) \qquad [5]$$

To prove what we set out to prove, it is sufficient to
show that the numerator of the fraction on the left of
[2] is equal to [5].

Now, it may be noted that this numerator in [2] can be
written as

$$N_1 N_2 (N_1 + N_2 - 2)\,[\, d' W^{-1} d - d_2' W_{22}^{-1} d_2 \,]$$

By partitioning d' as $[d_1' : d_2']$ and considering the
inverse of a supermatrix (5:469), $d' W^{-1} d$ may be expressed
as

$$[\, d_1' : d_2' \,] \begin{bmatrix} M^{-1} & -M^{-1} W_{12} W_{22}^{-1} \\ -W_{22}^{-1} W_{21} M^{-1} & W_{22}^{-1} + W_{22}^{-1} W_{21} M^{-1} W_{12} W_{22}^{-1} \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}$$

where M is defined by [4]. Thus, we have that

$$d' W^{-1} d = d_1' M^{-1} d_1 - d_2' W_{22}^{-1} W_{21} M^{-1} d_1 - d_1' M^{-1} W_{12} W_{22}^{-1} d_2 + d_2' W_{22}^{-1} d_2 + d_2' W_{22}^{-1} W_{21} M^{-1} W_{12} W_{22}^{-1} d_2$$

And, therefore,

$$d' W^{-1} d - d_2' W_{22}^{-1} d_2 = d_1' M^{-1} d_1 - d_2' W_{22}^{-1} W_{21} M^{-1} d_1 - d_1' M^{-1} W_{12} W_{22}^{-1} d_2 + d_2' W_{22}^{-1} W_{21} M^{-1} W_{12} W_{22}^{-1} d_2$$

which simplifies to the triple matrix product in [5]. In
essence, then, what has been shown is that the total
distance function ($D_p^2$) can be written as the sum of the
distance function for the covariates ($D_q^2$) and the distance
function for the main variates after adjustment for the
covariates ($D_{p \cdot q}^2$), i.e.,

$$D_p^2 = D_q^2 + D_{p \cdot q}^2$$

In terms of multivariate tests of hypotheses, what has
been shown is the formal equivalence of (a) a test of the
equality of distances between (the centroids of) two populations
before and after some of the original variables
have been deleted, and (b) a test of the equality of two
population centroids using a covariance analysis of the
questionable variables with the significant variables as
covariates. The common statistic yields an indicator of
whether or not there remains any appreciable useful
information in the data after the significant variables have
been removed.
There is a limited practical implication as well as a
theoretical implication of the proven result. It is not
necessary that a researcher interested in deleting
variables execute two runs of a discriminant analysis
program to obtain D² values, and then calculate the
statistic yielded by [2] [...] computer centers. (Re[...]
using different subsets.)

More important, however, [...] theoretical im[...]
variate analysis. The following procedure [...]
by Hall (4:8-9).


1. Select the variable with the highest univariate 
F-ratio and rerun the analysis using this variable as a co- 
variate. 

2. From the results of the reanalysis (a MANCOVA), 
select from the remaining variables that variable which 
yields the highest univariate F-ratio. Assign it as an ad- 
ditional covariate, and reanalyze. 

3. Continue steps 1 and 2 until the multivariate F-ratio 
shows no significant variation (say, p < .10) among the 
means of the remaining variables. 

This procedure, which probably tends to overestimate the
number of significant variables, could be modified by a
stepwise procedure (4:9). Cramer and Bock (2:607)
indicate that such a use of MANCOVA to delete variables 
is implicit in Rao's work (10) as a generalization of the 
two-group case. 

MANCOVA was used by Horton, Russell, and Moore 
(6) for selecting the most effective discriminators; the 
method is similar to that of Hall except that it involves 
more analyses to determine the significant variables. The 
procedure used begins with the smallest subset of variables 
(i. e., one) and then, after testing all possible combinations 
with the complementary subset, selects that combination 
which left the smallest residual, as indicated by the small- 
est value of a test criterion (a likelihood ratio statistic). 

If the value of the criterion of the selected subset is 
significant, an additional variable is included in the subset 
of variables to be retained using the same procedure as 
before. This cycle of operations is terminated when no 
significant residual between-group variance remains. In 
the 12-group situation of this study, five out of nine 
original variables were retained. 

In addition to selecting a good subset of discriminators,
the use of MANCOVA also provides the researcher with an
ordering of the variables with respect to contribution to
discrimination among the criterion groups. This may be
helpful for interpretive reasons, especially in comparing
results across studies involving similar [...]. [...]
may be said when discriminant functions are obtained and
interpreted following the rejection of an hypothesis
(for main or interaction effects) in a factorial multivariate
analysis of variance. In such a case, follow-up univariate
analyses may be considered (with possible adjustments in
nominal α-levels) and, assuming more than one discriminator
is "significant," it may be well to carry out
MANCOVA in interpreting the additional contribution of
succeeding single discriminators.

Some comments about the use of MANCOVA in var- 
iable selection may be made. First of all, no claim can be 
made that such a selection procedure would yield the best 
subset of the resulting size. What would be necessary, of 
course, is to examine all possible subsets of a given size. 
Secondly, a “forward deletion” or step-up procedure as 
described is open to criticism, in the sense that after a 
deletion the analysis is carried out on the “bad” variables, 
and it can be said that in so doing much of what is being 
analyzed may be “noise.” Proponents of such a view 
would prefer a "backward deletion" or stepdown pro- 
cedure in which the bad variables are discarded from the 
analysis at each step. This issue also arises with variable 
selection in multiple regression analysis; Mantel (9) points 
out other advantages of a variable selection scheme in 
which the variables are successively discarded one at a 
time from the original full set. 
THE STABILITY OF TEACHER RATINGS ON
THE DEVEREUX ELEMENTARY SCHOOL
BEHAVIOR RATING SCALE 1,2


FRED H. WALLBROWN
Columbus, Ohio Public Schools

JOHN BLAHA
University of Detroit


ABSTRACT

[...] the three single items compared favorably with [...]


AN INCREASING BODY of research data, for example
(4-7), suggests that the Devereux Elementary School Behavior
Rating Scale (3) constitutes a promising technique for
understanding the learning and behavior patterns of
children referred for psychological assessment. The
Devereux is composed of 47 behavioral items which are
relevant to classroom achievement and/or adjustment.
Three items (Unable to Change, Quits, and Slow Work)
are scored singularly, but the remaining items are combined
to obtain scores for the following behavioral factors:
Classroom Disturbance; Impatience; Disrespect-Defiance;
External Blame; Achievement Anxiety; External Reliance;
Comprehension; Inattentive-Withdrawn; Irrelevant
Responsiveness; Creative Initiative; and Need for Closeness
to Teacher.

As noted by Littell (2:137), the items comprising the
Devereux were selected and grouped with great care, but
only limited reliability data were included. Specifically,
Spivack and Swift (4:30-33) provided test-retest reliability
estimates for a subsample of 128 children who were rated
again one week after their initial rating. The reliability
estimates obtained for this group were satisfactory, as
indicated by a median reliability coefficient of .87 for all
eleven factors and coefficients for individual factor scores
which ranged from .85 through .91. As might be expected,
the reliability coefficients of .72, .80, and .71 for the three
individual items were somewhat lower than those for the
eleven factors.
The usefulness of the Devereux as a technique for the
diagnosis and remediation of learning and behavior problems
is contingent upon the stability of scores across relatively
long periods of time. Consequently, the present study was
designed to investigate the long-term reliability of
Devereux scores for primary grade children in an open
classroom setting.

Method 


Subjects 


The final sample was comprised of 67 children (35 boys and
32 girls) who were enrolled in the primary unit of a suburban
elementary school from May 7, 1973, through
May 17, 1974. The total first grade enrollment was 79 at
the time of the initial ratings; however, the sample was
reduced to 67 by the time the final ratings were obtained
near the end of second grade.

The majority of the Ss were above-average in intelligence
and were from upper-middle-class families as indicated by
the father's occupational status and educational level.
That is, the median educational level for fathers was
college graduation, and most of them were employed
in professional or managerial positions. The mean IQ for
the final sample on the Cognitive Abilities Test—Primary
Form 2 (8) was 114.7, with the SD of 13.8.


Data Collection


The school from which the sample was obtained is
organized in accordance with the open classroom concept
and also provides for vertical grouping at the primary level.
With this arrangement, each of the eight teachers in the
primary unit had a class comprised of first-, second-, and
third-grade children. Consequently, eight teachers were
involved in completing both the initial and final ratings for
the sample.

The initial ratings were completed during a five-day
period in the spring of 1973, and the second set of ratings
was completed approximately one year later during a similar
time period. The eight teachers who completed the
initial ratings were provided with a one-hour training
session one week before the rating period. A similar training
period was held before the final rating period to
familiarize the two new teachers with the rating procedure
and to review the procedure for the six teachers involved in
the initial ratings.


Data Analysis 

Test-retest reliability estimates were obtained by com- 
puting product-moment correlations between the initial 
and final ratings for each of the eleven factors and three 
individual items. Standard errors of measurement were 
computed using the formula provided by Horst (1:294). 
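
The computation just described can be sketched in a few lines of Python. This is an illustration only: the ratings are made-up numbers, the function names are hypothetical, and since Horst's formula is not reproduced in this article, the standard error shown is an assumed form, namely the standard deviation of the change between the two ratings, sqrt(SD1^2 + SD2^2 - 2 r SD1 SD2).

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Product-moment correlation between paired ratings."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sem_of_change(x, y):
    """Standard error of a score change between two ratings.

    Assumed form (not taken from Horst): the standard deviation of the
    difference scores, written in terms of the two SDs and r.
    """
    r = pearson_r(x, y)
    sd_x, sd_y = stdev(x), stdev(y)
    return math.sqrt(sd_x ** 2 + sd_y ** 2 - 2 * r * sd_x * sd_y)


# Hypothetical factor scores for six children (not data from the study).
initial = [10, 14, 9, 17, 12, 15]
final = [11, 13, 10, 18, 10, 16]

r = pearson_r(initial, final)
sem = sem_of_change(initial, final)
print(f"test-retest r = {r:.2f}, SEM = {sem:.2f}")
```

With real data, each of the fourteen Devereux scores would be passed through the same two functions, once per score.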


Table 1.—Reliability Estimates and Other Statistics for Devereux Scores

                                     Initial       Final
Devereux Score                       X     SD      X     SD      r     SEM

FACTOR:

Classroom Disturbance                                           .75    3.0
Impatience                                                      .62
Disrespect-Defiance                                             .82
External Blame                                                  .71
Achievement Anxiety                                             .49
External Reliance                                               .73
Comprehension                                                   .86
Inattentive-Withdrawn                                           .67
Irrelevant Responsiveness                                       .79
Creative Initiative                                             .78
Need for Closeness to Teacher                                   .70

EXTRA ITEMS:

Unable to Change                                                .82
Quits                                                           .74
Slow Work                                                       .72

Results 


The reliability coefficients for the fourteen Devereux 
scores (eleven factors and three items) are presented in 
Table 1, along with other relevant statistics. The standard 
error of measurement (SEM) for each score is included 
since this information is most important for the interpreta- 
tion of Devereux ratings for individual children. The means 
and standard deviations for both the initial and final 
ratings are also included in Table 1. These data are 
important in that they establish the characteristics of the 
present sample and facilitate comparisons with other 
samples. 


Examination of the r’s reported in Table 1 suggests
that, for children in an open classroom setting, some of the
Devereux scores are much more reliable than others. The
median r for the eleven Devereux factors was .73, while
the r’s for individual factor scores ranged from .49 through
.86. The reliability estimate was highest for Comprehension
with an r of .86; Disrespect-Defiance ranked second with
an r of .82. The reliability estimate for Achievement
Anxiety (r = .49) was substantially below that obtained
for any other factor, but Impatience (r = .62) and
Inattentive-Withdrawn (r = .67) were also relatively low.
The median reliability estimate was the one obtained for
External Reliance (r = .73). The reliability estimates for the
other factors were closer to the median, with Irrelevant
Responsiveness (r = .79), Creative Initiative (r = .78), and
Classroom Disturbance (r = .75) somewhat above, and
External Blame (r = .71) and Need for Closeness to Teacher
(r = .70) slightly below.

The reliability estimates for the three individual items
compare favorably with those for the eleven factors even 
though the factor scores are obtained by summing the 
ratings for several items. In fact, the reliabilities for Unable 
to Change (r = .82) and Quits (r = .74) were both above 
the median r for the eleven factors, and the reliability for 
Slow Work (r = .72) was only slightly below. 

The SEMs reported in Table 1 provide concrete estimates 
of the extent to which raw scores on the Devereux tend 
to differ across a one-year time interval. One cannot 
determine from the SEMs to what extent score variability 
is the result of measurement errors and how much is 
attributable to systematic behavioral changes. However, 
the SEMs probably provide relatively accurate estimates of 
overall stability of Devereux scores across this time 
period. For example, the SEM for Classroom Disturbance 
is 3.0, which indicates that the chances are about 1 in 3 
that one would obtain a random score change (increase or 
decrease) of 3 points or more. Similarly, the chances are 
about 1 in 20 of obtaining a score change of 6 points 
(2 SEMs) on the basis of random error. The SEMs for the 
remaining scores can be interpreted in a similar manner. 
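
The "1 in 3" and "1 in 20" figures above follow from the normal curve. The short check below is a sketch that assumes random score changes are normally distributed with a standard deviation of one SEM; the function name is hypothetical.

```python
import math


def chance_of_change(k):
    """Two-sided probability P(|Z| >= k) for a standard normal Z.

    Interpreted here as the chance of a random score change, up or
    down, of k SEMs or more (normality is an assumption).
    """
    return math.erfc(k / math.sqrt(2))


for k in (1, 2):
    p = chance_of_change(k)
    print(f"change of {k} SEM(s) or more: p = {p:.3f}")
```

Under this assumption the probabilities come to roughly .32 for one SEM and .05 for two SEMs, which is where the approximate odds quoted in the text come from.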


Discussion 


Generally speaking, the reliability estimates discussed
above are substantially lower than those reported by
Spivack and Swift (4) for a one-week time interval. This
overall pattern of differences is understandable since the
test-retest interval for the present study was much longer,
i.e., one year. Yet, determining the adequacy of long-term
reliability estimates poses a real problem since, as noted
by Horst (1:289), there is very little known about the
relationship between reliability and the time interval
between retests. Under these circumstances, one can
only offer a tentative discussion of the reliability data
and suggest that persons using the Devereux with individual
children should consider scores individually in terms of
their SEMs. Such an approach is necessary because the
reliability estimates range from Comprehension with very
good reliability, through Achievement Anxiety with very
poor reliability.


In the final analysis, those individuals considering the
Devereux for either group or individual use should
evaluate its reliability in terms of their own unique situation
and needs. Effective evaluation necessarily involves com- 
parison of Devereux reliabilities with those for other rating 
scales, as well as consideration of the overall state of 
measurement in the area. 

Those individuals interested in the reliability of the 
Devereux should also bear in mind the nature of the pres- 
ent sample and the specific type of curricular arrangement 
existing in the school where the ratings were obtained. 
Strictly speaking, the extent to which these reliability 
estimates can be generalized beyond the present sample 
is an empirical matter to be resolved through future 
research. However, it seems reasonable to surmise that 
these estimates would be most applicable to bright Ss
from suburban schools with an open classroom arrange-
ment. In contrast, one would not necessarily expect these
estimates to describe the reliability of the Devereux for 
Ss in a traditional classroom setting. 


NOTES 


1. Appreciation is due Mr. Ronald Hopper, principal of
Worthington Hills School, and the teachers in the primary unit
for their full cooperation during the conduct of the study.
2. Reprint requests should be sent to Dr. Jane D. Wallbrown,
Worthington City Schools, 55 East Stafford Avenue, Worthington,
Ohio 43085.


CREATIVITY TRAINING IN ELEMENTARY SCHOOLS IN BRAZIL


JOHN F. FELDHUSEN


ABSTRACT


The effects of the Purdue Creative Thinking Program (PCTP) on the creative abilities of elementary school children in an underdevel-
oped country were evaluated in an experiment with 578 Brazilian fourth- and fifth-graders. At each grade level, twelve classes were
assigned to each of two treatment conditions (PCTP with reinforcement and PCTP without reinforcement) and a control group which
had no exposure to PCTP. Pre- and post-testing with the Torrance Tests of Creative Thinking (TTCT) yielded twelve creativity measures.
Using a 3 X 2 X 2 (treatment by sex by grade level) analysis of covariance, the creativity training was found to be effective, but rein-
forcement of pupil performance appeared to have a decremental effect.


WHILE THERE HAS BEEN a great deal of research on 
creativity in the United States (16) and a number of other 
countries (17), few studies have been conducted on the 
creative abilities of children from underdeveloped coun- 
tries. Evidence from a number of sources (4, 10, 18) indi- 
cates that all children possess some creative potential. 
However, it is essential that the home and school provide 
conditions and instruction to help children realize their 
full potential. Schools in the United States devote much 
attention to creativity. Creativity should also be stressed 
in the schools of less well-developed nations. Through the 
development of creative thinking abilities in their children, 
these nations can make more rapid progress in the next 
generation.

In order to facilitate the development of creativity, a
number of methods and programs have been designed.
Two of the most widely researched and evaluated pro-
grams for elementary school children are the Productive
Thinking Program (PTP), developed by Covington, Crutch-
field, and Davies (1), and the Purdue Creative Thinking
Program (PCTP), developed by Feldhusen et al. Both
programs are designed to strengthen cognitive skills which
are central to the creative process and to provide experi-
ences in creative thinking. Davis (3) reviewed methods
and programs for teaching creative thinking and concluded
that there are a number of successful methods and pro-
grams for teaching creative thinking.

The main purpose of this study was to determine
whether the creative abilities of Brazilian children of ele-
mentary school age could be increased through the use
of the PCTP (6).


PCTP involves 28 audiotapes and a set of three or four 
exercises for each tape. The taped program consists of a 
3- to 4-minute presentation designed to teach a principle 
or idea for improving creative thinking, and an 8- to 10- 
minute story about a famous American pioneer, states- 
man, inventor, or researcher. The exercises contain printed 
directions and problems, or questions, which are designed 
to provide verbal and nonverbal practice in originality, 
flexibility, fluency, and elaboration in thinking. 


Previous research with the program (5, 6, 8, 15) sug- 
gested that it is effective in improving creative thinking 
skills as measured by the Torrance Tests of Creative 
Thinking. Feldhusen et al. (5) studied the effects of the 
PCTP on the creative thinking abilities of children from 
third, fourth, and fifth grade, and found substantial gains 
in creative thinking abilities after 28 weeks of training. In 
a second study, Feldhusen and his co-workers (6) evalu- 
ated the effectiveness of the different parts of the PCTP 
in a sample of fourth-, fifth-, and sixth-grade children. 
Although the program or its components were not effec- 
tive at all grade levels or for all criterion variables, there 
was considerable evidence for the overall effectiveness of
the PCTP. Speedie, Treffinger, and Feldhusen (15) studied
the long-range effects of the PCTP with a sample of upper
elementary children and found results quite similar to
those of Feldhusen et al. (6). Similar results were obtained
by Shively et al. (14) in a study comparing the effective-
ness of the Productive Thinking Program (1) and the PCTP
in a sample of fifth-grade children. Feldhusen, Speedie,
and Treffinger (8) summarized research involving the PCTP
and concluded that it is an effective program for grades
three to six.


Table 1.—Significant F Ratios for the Analyses of Covariance of Creativity Test Scores

Figural (L = Lines, PC = Picture Completion)

Sampling Unit / Source of Variation    df    Fluency, Flexibility, Originality

Individual students
  Treatment                            2     13.49  10.22  19.22
                                             (411.56) (0.43) (124.18) (0.62) (1261.45) 25.56
  Sex                                  1     8.51
                                             (259.58) (1.60) (14.35) (2.62) (13.01) (0.03)
  Grade                                      17.72
                                             (17.23) (22.13) (2.25) (7.50) (8.64) (47.81)
  Residual                                   (30.49) (1.24) (12.14) (1.94) (65.62) (9.87)

Classes
  Treatment                            2     6.38  6.00  11.67
                                             (511.29) (143.56) (0.17) (1369.14) (26.01)
  Classes within treatments                  2.04
                                             (23.92) (117.35) (4.23)
  Subjects within classes              520   (11.70) (63.90) (10.30)

Verbal (U = Unusual Uses, PI = Product Improvement)

Sampling Unit / Source of Variation    df    Fluency, Flexibility, Originality

Individual students
  Treatment                                  (0.25) (413.76) (62.72)
  Sex                                        (5.21) (52.87)
  Grade                                      (141.97) (0.90) (4.93)
  Residual                                   (30.71) (21.36) (21.85)

Classes
  Treatment                                  10.71  5.47
                                             (0.20) (473.96) (72.68)
  Classes within treatments            20    (14.45) (2.45) (44.14) (13.28)
  Subjects within classes              520

*p < .01
Mean squares in parentheses

Previous research (7, 11) using reinforcement to encour-
age thinking indicated that written verbal comments on
children's creative productions would be motivating to
the children and would increase their fluency and origi-
nality. However, the stress placed on the avoidance of
evaluation in creative thinking by Osborn (12) raised some
doubts about the possible effects of this variable on crea-
tive thinking.

In the present study, fourteen of the twenty-eight
stories of the PCTP and the corresponding exercises were
used with a sample of children in Brazil. The choice of the
fourteen dramatized stories was based on their relationship
to the program of history and social studies in Brazilian
schools. The programs were translated into Portuguese by
the first author.


Method 


Sample 

A total of 578 fourth- and fifth-grade children from 24 
classes in both private and public elementary schools in 
Brasilia, Brazil, participated in the study. There were 
twelve fourth-grade and twelve fifth-grade classes, with 
eight classes assigned to each of two treatment conditions 
(program with reinforcement of the pupils’ performance
on the creativity exercises and program with no reinforce- 
ment of the pupils’ performance on the creative exercises) 
and eight classes assigned to the control group condition. 


Procedure 

Before instruction began, two verbal sub-tests (Unusual
Uses and Product Improvement) and two figural sub-tests
(Circles and Picture Completion) of the Torrance Tests of
Creative Thinking (TTCT), Form B (16), were administered
as pre-tests to all pupils in both the experimental and con-
trol groups. The tests were translated into Portuguese by
the first author. The instructional material was then admin-
istered to the experimental groups by the teacher once a
week for fourteen consecutive weeks. The teachers were
taught how to use the material by the first author. In
administering the program, the teacher read the introduc-
tion and the story to the children since tape players were
not available. The pupils then worked on the printed exer-
cises. In one experimental condition (program with rein-
forcement), the children’s completed exercises were eval-
uated by the experimenter. She wrote encouraging
comments on their papers intended to reinforce fluency
and elaboration (e.g., very good; good; good, but try
harder; try harder), and then gave them back to the children.
Pupils in the other experimental condition (program with
no reinforcement) received no reinforcement. Pupils in the
control group received no creativity training.

At the conclusion of the series of fourteen programs,
two verbal (Unusual Uses and Product Improvement) sub-
tests and two figural (Lines and Picture Completion) sub-
tests of the TTCT, Form A, were administered as post-
tests to all pupils in the experimental and control groups.

A 3 X 2 X 2 (treatment by sex by grade level) analysis
of covariance was used to analyze pupil performance on 
each of the twelve creativity measures: figural fluency, 
flexibility, and originality for the sub-tests of Lines and 
Picture Completion; and verbal fluency, flexibility, and 
originality for the sub-tests of Unusual Uses and Product 
Improvement. Previous research by the authors indicated 
that the creativity sub-tests were task-specific and should 
be analyzed separately. The covariates for the divergent 
thinking measures were the respective TTCT pre-test meas- 
ures. Post hoc individual comparisons between adjusted 
means were made for significant effects using the Newman- 
Keuls procedure. Further analyses of covariance were 
carried out to analyze the effect of treatment using the 
class as the sampling unit. Alpha was set at .01 for all tests 
of significance. 
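
The adjustment that underlies the adjusted means can be illustrated with a simplified sketch. It assumes a one-way design with a single covariate (the pre-test), whereas the study used a 3 X 2 X 2 design, so it shows the idea rather than the analysis actually run. The function name and all scores are hypothetical.

```python
from statistics import mean


def adjusted_means(groups):
    """Covariance-adjusted post-test means for a one-way design.

    groups maps a condition label to a list of (pretest, posttest)
    pairs. Each group's post-test mean is moved to the grand pre-test
    mean using the pooled within-group regression slope.
    """
    all_pre = [pre for pairs in groups.values() for pre, _ in pairs]
    grand_pre = mean(all_pre)

    # Pooled within-group slope of post-test on pre-test.
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        mx = mean([pre for pre, _ in pairs])
        my = mean([post for _, post in pairs])
        sxy += sum((pre - mx) * (post - my) for pre, post in pairs)
        sxx += sum((pre - mx) ** 2 for pre, _ in pairs)
    slope = sxy / sxx

    return {
        label: mean([post for _, post in pairs])
        - slope * (mean([pre for pre, _ in pairs]) - grand_pre)
        for label, pairs in groups.items()
    }


# Hypothetical (pre-test, post-test) fluency scores, not study data.
scores = {
    "reinforced": [(10, 14), (12, 17), (14, 18)],
    "not reinforced": [(11, 17), (13, 19), (15, 22)],
    "control": [(9, 10), (12, 12), (15, 15)],
}
for label, value in adjusted_means(scores).items():
    print(f"{label}: adjusted mean = {value:.2f}")
```

A group that started above the grand pre-test mean has its post-test mean lowered, and a group that started below has it raised, so that conditions are compared as if they had begun at the same level.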


Results 


The results of the analyses of covariance are summar- 
ized in Table 1. Using the individual subject as the sampl- 
ing unit, a consistent finding across all dependent variables 
was that no interaction effect reached statistical signifi- 
cance. The main effect of treatment was significant for all 
three creativity dimensions of fluency, flexibility, and 
originality for the Lines and Unusual Uses sub-tests, but 
here the treatment effect was also significant for figural 
originality on the Picture Completion sub-test and for 
verbal originality on the Product Improvement sub-test. 
The effect of classes-within-treatments was significant for
figural fluency on the Lines and Picture Completion sub- 
tests, for figural flexibility on the Lines sub-test, and for 
verbal originality on the Unusual Uses sub-test. The signifi- 
cant classes-within-treatments effect indicates differences 
among classes in the effectiveness of the program. 

The adjusted means for experimental treatments for 
each of the dependent variables are presented in Table 2. 
The Newman-Keuls analyses revealed that the differences 
between control and treatment conditions, as measured 
by the Lines and Unusual Uses sub-tests, were significant 
for all creativity dimensions, with treatment means being 
greater than control means in all instances. The compari- 
sons between the two experimental conditions, program 
with and without reinforcement, revealed that the differ- 
ences were significant for figural fluency, flexibility and 
originality, and for verbal fluency. In all instances, includ-
ing the two dependent variables for which the difference
was not significant, the means for the non-reinforcement
condition were greater than the means for the reinforce-
ment condition.


Discussion 


The results of this research confirm the effectiveness
of creativity training in another culture, especially with
children in a relatively underdeveloped educational envi-
ronment. Memorization, respect for authority, and quite
rigid teacher control are stressed in Brazilian schools, and
classes are large. Such conditions might be expected to
make children impervious to the effects of a creativity
training program. Furthermore, the teachers were
unfamiliar with the basic ... Thinking Program ...
their own ideas, to ... to be original. In spite ...
teachers faced, the gains ... substantial. Whether ...
was a teacher effect ... The results also ...
reinforcement on the ... reinforcers, in the form of brief
comments, should augment learning, as Page (13) found true
with elementary school children. However, reinforcers in
the form of brief comments written on papers may evoke a
sense of being evaluated and attendant fears or anxiety. As
suggested by Osborn (12), Wallach and Kogan (20), and by
Feldhusen and Hobson (9), freedom from concern about being
judged or evaluated may be the best environment to foster
not only immediate creative production but also longer-
range learning of creative thinking. The latter proposition
seems to be confirmed by the results of this research.

The Torrance Tests have been criticized as not having
well-established validity, and particularly as not having a
clearly established relationship with real life creative pro-
duction (2). However, Treffinger (19) reviewed the
research related to creative thinking and concluded that
there is substantial predictive and concurrent criterion-
referenced validity for creativity measures.

Thus, it seems likely that creativity training can ...
the Purdue Creativity Training Program may be particu-
larly effective, and that the Torrance Tests of Creative
Thinking can be used as one set of criterion measures. It
is hoped that further research will be conducted to estab-
lish the generalizability and applicability of these results
in children’s lives outside of school and in later stages of
their lives.


Table 2.—Adjusted Means for Creativity Test Scores

Figural (L = Lines, PC = Picture Completion)

Experimental
Condition           Fluency, Flexibility, Originality

Reinforced          16.31    9.40
Not reinforced      17.94    9.46
Control             14.45    9.42

Verbal (U = Unusual Uses, PI = Product Improvement)

                    Fluency, Flexibility, Originality

                    17.45   13.51   5.37   8.38
                    20.20   12.28   5.64   7.30
                    14.52   12.79   2.32   7.81


HEURISTICS FOR CLASSROOM DESIGN


ABSTRACT


Educators have long debated the question of optimal social design for the classroom. Particularly controversial have been the issues of 
leadership style and structural design of the classroom or learning environment. Perhaps the best answer to the question, "What is the 
‘optimal’ social design that can be applied in the classroom?” is, “It all depends." The purpose of this paper is to present a set of 
heuristics which provides a systematic approach for developing social designs for classroom learning under various contingencies. 


THE APPROPRIATE STARTING POINT for any 
approach to developing social designs for classroom learn- 
ing is the educational objective. It is generally accepted that 
intellectual growth is a primary objective of education. 
Intellectual growth includes: (a) learning new information; 
(b) new applications and/or techniques; and (c) motivation. 


The introduction of new information is generally
thought of as descriptive or technical in nature. Introduc-
tory-level courses are usually designed to provide this form
of intellectual growth.
Learning new applications or techniques often entails
the utilization of principles or concepts previously learned.
Examples of this form of intellectual growth include case
studies, problem solving, and laboratory research.

Motivation is somewhat more abstract. Motivation may
be viewed as stimulation of thought and ideas which
generates interest for further intellectual growth in an area
or field.

Any one or a combination of these forms of growth
may be the objective of a specific academic course. The
purpose of this paper is to present an approach which may
be used to determine the most appropriate social design
for achieving these forms of growth, given certain input
variables.


Inputs 


Two input variables are considered: (a) the belief system
of the students; and (b) their previous educational back-
grounds.

Several researchers have demonstrated that it is possible
to categorize individuals based upon their belief system.
The work of Rokeach (3), DiRenzo (2), and Stern, Stein,
and Bloom (5) can be consolidated to characterize dogmatic
and non-dogmatic students as follows:


Dogmatic                            Non-Dogmatic

Authoritarian                       Non-Authoritarian
Closed-minded                       Open-minded
Rigid in opinions and               Flexible in opinions and
  beliefs                             beliefs
Low tolerance toward                High tolerance toward
  others                              others
Inconsiderate of others             Considerate of others
Conservative                        Liberal
Depersonalized relation-            Highly personalized relation-
  ships with others                   ships with others
Inhibited                           Outgoing
Average intelligence                Above average intelligence
Requires highly structured          Excels in loosely structured
  environment                         environment
Prefers lecture method of           Prefers discussion method of
  instruction                         instruction
Prepares for exams by               Prepares for exams by trying
  memorizing main points              to understand concepts
Prefers objective tests             Prefers essay exams
Participates about average          Active participant in class
  in class                            discussions
Is chiefly ...                      Is chiefly a contributor of
                                      ideas and concepts
Vocational interests in             Vocational interests in
  engineering, physical               social sciences,
  sciences, law,                      humanities, teaching, etc.
  accounting, etc.


While it is not the objective of this paper to discuss
methodologies for classifying classroom groups in terms of
dogmatism, it is worthwhile to point out that this is indeed
possible, and proven tests for this determination are
available. Examples include the Rokeach Dogmatism
Scale, the California F Scale, the Gough-Sanford Rigidity
Scale, the Opinionation Scale, and the Ethnocentric Scale
(4).

The second input variable, previous educational back-
ground, refers primarily, but not exclusively, to the
level of attainment an individual has reached in a given
area or field. More will be said about this variable later.


Social Designs 


The social design of a classroom learning situation is
primarily composed of two variables, the leadership style
of the teacher and the structural design of the class.

Leadership style is generally viewed as a dichotomy.
The labels vary somewhat but the dichotomous nature of
styles is prevalent in the literature. Anderson, in a review
of research concerning the effects of leadership styles,
synthesized a large number of studies which referred to
“authoritarian” versus “democratic” and “teacher-
centered” versus “student-centered” styles (1). Although
this construct is extremely oversimplified, it tends to be
quite widely applied in analyzing ... students.

The structural design of the class can be one or
a combination of the following:


1. Lecture—In this case the faculty member assumes
the traditional role of teacher.

2. Teacher-led discussion—Here, at least conceptually,
the faculty member assumes the role of a guide.
Exchange of ideas is primarily of a multiple bilateral
nature.

3. Teacher as mentor—Somewhere between teacher-led
discussion and open discussion is a situation in which
the faculty member assumes the role of a mentor.
Conceptually this would compare favorably with an
advanced undergraduate- or graduate-level seminar.
Interaction is multilateral by design.

4. Open discussion—Here again interaction is multilateral
but entirely among colleagues. The structure exhibits
an obvious absence of a central authority figure.

5. Independent study—In this case the student is basic-
ally alone in his search for knowledge.


The descriptive material thus far presented can be
easily summarized in terms of the model or flow diagram
of Figure 1.


Inputs                     Social Designs        Intellectual Growth

Belief structure of        Leadership style      New information
the students               Structural design     Applications and/or
Previous educational                             techniques
background                                       Motivation


Figure 1.—A model for classroom design


The model is intended to indicate flow commencing with
the given inputs and resulting in intellectual growth.
Unfortunately, social design is often considered the equiva-
lent of the “black box” concept. The conceptual frame-
work of social designs is poorly understood by many
academicians and therefore often applied incorrectly.

Instead of attempting to hypothesize the effect on
intellectual growth of each possible combination of social
designs given the possible input parameters, the remainder
of this paper will be devoted to synthesizing the relevant
literature in terms of the model, and to developing generali-
zations which may assist the teacher in formulating social
design heuristics and policies for specific classroom
situations.


The Student 


The literature on individual characteristics strongly
supports the proposition that a dogmatic individual,
regardless of previous educational experiences, will
achieve highest in a structured environment. This would
include autocratic (or its semantic equivalent) leadership
style and rigidly structured lectures. Whether functioning
at an introductory or advanced level, within or outside
of his area of specialization, the dogmatic individual excels
when course content is explicit and/or technical in nature.
Likewise, he dislikes and tends to do poorly in courses
which he feels have little relevance to his specific career
objectives. He views education as vocational prepara-
tion and is not likely to be interested in subjects outside
of his primary field(s).

Assuming that a “class ... it is predominantly dogmatic ...
apparent. Intellectual ... authoritarian leadership ...
important factors to consider in determining the best
social design to apply in this situation.


Previous Educational Background 


There are generally two broad categories of previous 
educational experience. At one extreme are students who 
have had no formal exposure to a particular area of study. 
Students enrolled in introductory courses fall into this 
category. At the other extreme are students who have had 
educational experiences which directly support and 
provide background information for the particular course 
in which they are enrolled. Typical of this group is an 
advanced undergraduate or graduate student taking a 
course in his major or minor area. 


Intellectual Growth Objective 


Given that the student group is non-dogmatic, the 
intellectual growth objective(s) become important criteria 
in selecting the optimal social design. 

In the case of the student with no previous exposure in 
an area, the intellectual growth objective is normally the 
introduction to new information. Most studies suggest that 
if new information is the purpose of a course, an 
"instructor-centered" leadership style is most effective. 
Also most appropriate in this situation is either a lecture 
or a combination of lecture and teacher-led discussion 
structure. Students in this group are being exposed to an 
entirely new environment; they are more at ease and 
learn better when their primary responsibility is to listen 
and absorb the new information in straight-forward,
explicit terms. This also indicates their expectations in 
the course. They feel that neither they nor their colleagues 
have much to contribute to the group's learning process, 
and thus feel a strong need for a central leader-authority 
to guide them. Their discussions are normally of a multiple 
bilateral nature, with each student interacting only with 


the leader. Other structural designs tend to result in the 
same type of interaction. 

Next, one can generalize using the other end of the
educational experience spectrum and again look at the
growth objective through the nature of the course. The
typical course in this category will be advanced and its con-
tent less explicit. The course is normally an integral part
of the student’s curriculum. Therefore, both the students
and the teacher have higher expectations. The intellectual
growth objectives emphasize applications, techniques,
and motivation. The most effective leadership style
in this case will be “non-directive” or “student-centered.”
Applicable structural designs are flexible, from teacher-led
discussion to open discussion to independent study.

The generalizations thus far presented are well docu- 
mented in empirical studies. The realization that there 
is some discrepancy as to the optimal structural design in 
a given situation has provided the motivation for many of 
the research projects which formed the foundation for this
investigation.


Summary and Conclusion 


The objective of this paper has been to provide a 
systematic approach for developing social designs for 
classroom learning under various contingencies. The
recommended procedure is as follows: 


1. Determine inputs 

2. Determine intellectual growth objective(s) 
3. Determine optimal structural design 

4. Determine appropriate leadership style 
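
The four steps above can be sketched as a small decision function. This is only an illustration of the generalizations drawn in this paper; the function name, the argument names, and the exact label strings are hypothetical, and real classes will rarely be as uniform as the inputs assume.

```python
def recommend_social_design(dogmatic, objective):
    """Suggest a leadership style and structural designs for a class.

    dogmatic: True if the class is predominantly dogmatic (step 1).
    objective: "new information", "applications", or "motivation"
        (step 2).
    Returns (structural_designs, leadership_style) for steps 3 and 4.
    """
    valid = {"new information", "applications", "motivation"}
    if objective not in valid:
        raise ValueError(f"unknown objective: {objective!r}")

    # Dogmatic students are described as achieving highest in a
    # structured environment, whatever their background or objective.
    if dogmatic:
        return ["lecture"], "authoritarian"

    # Non-dogmatic students meeting an area for the first time.
    if objective == "new information":
        designs = ["lecture", "lecture with teacher-led discussion"]
        return designs, "instructor-centered"

    # Non-dogmatic students in advanced courses.
    designs = ["teacher-led discussion", "open discussion",
               "independent study"]
    return designs, "student-centered"


designs, style = recommend_social_design(dogmatic=False,
                                         objective="applications")
print(style, designs)
```

Keeping the rules in one function makes it easy to see which input drives each recommendation and to revise a rule when empirical testing calls for it.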


It would be a rare case to find the variables discussed 
in this paper perfectly uniform in a classroom situation. 
It is therefore not possible to prescribe absolute formulas 
for classroom success. It is, however, possible to provide 
a systematic approach for analyzing the objectives, para- 
meters, and variables associated with intellectual growth. 

The utility of this or any other conceptual approach 
can only be determined by empirical testing. If it pro- 
vides the framework for a logical decision process in the 
development of social designs for classroom learning, it 
will have achieved the stated objective.


ABSTRACT


paired on the basis of their first examination scores. 
System which provided immediate item-by-i 
manner with no immediate feed 


THE EFFECTS ON PERFORMANCE of the delay of
feedback in classroom examination situations have been the
topic of considerable investigation. Some researchers have
adopted a general position that feedback should come as
soon as possible after the response due to a negative
relationship between delay of feedback and effectiveness of


In an attempt to resolve these conflicting findings, the
present research was designed to study the effect of immediate
item-by-item feedback versus no feedback on
multiple choice test performance. The present investigators
viewed feedback in the more affective terms of reinforcement,
and controlled for several possible sources of confounding,
such as item dependency, machine-novelty, and
different room/different instructor effects. It was predicted
that students receiving immediate feedback would
score significantly higher on the examination than students
receiving no feedback during testing.


Method 
Apparatus 


A Modular EDEX Student Response System was used
which allowed the instructor to provide immediate oral
feedback for up to 40 students simultaneously. Each student
had a response unit with five buttons and could
respond by pushing one of the buttons A-D or, if he desired
to make no response, could push the blank button.
These units were placed on standard classroom desks. The
instructor's monitor had the capacity to: (a) give the percentage
of students making each response; (b) indicate
the response made by each student; and (c) record the
total number of correct responses for each student.
Groups using the above apparatus were termed machine,
while those that did not were termed traditional.


Subjects 


Eighty-four students from two Introductory Psychology
classes plus thirty students from one Child Psychology
class at Middle Tennessee State University served
as Ss. The Introductory students were divided into four
groups (two from each class) of twenty-one Ss each, and
the Child Psychology class was divided into two groups
of fifteen. The lack of another Introductory Psychology
class necessitated using the Child Psychology class. All
classes were taught by the same instructor on the same


Procedure


Test 1 was given by the traditional (mimeographed)
method to all students from the Introductory classes, and
consisted of sixty multiple choice (four alternative) items
on material covered in the course. On the basis of Test 1
scores, the students were match-paired into four groups.
One of the Introductory classes was subdivided into two
groups: (a) the immediate feedback group, which received
item-by-item feedback during Test 2, and (b) the feedback
control group, which took Test 2 in the traditional manner.
The feedback control group took Test 2 at the same time
of day as the immediate feedback group but in a different
room proctored by a different professor. They received
mimeographed copies of Test 2 and recorded their answers
on IBM answer sheets. In another room, the immediate
feedback group received the same mimeographed
test, but they recorded their answers on the EDEX System
and were given feedback following each response.

Group I Ss from the Introductory class took Test 2 in
the traditional manner (using IBM answer sheets) in their
usual classroom and were proctored by the regular instructor.
At the same time, Group II from this class took Test 2
in the traditional manner, but in a new classroom with a 
different instructor. Thus, on Test 2 there was one 
experimental group (feedback) and three control groups 
(feedback control, same-instructor/same-room control, 
and different-room/different-instructor control). In order 
to investigate the possibility of a machine-novelty effect, 
the Child Psychology class was divided into two groups: a 
machine no-feedback group and a no-feedback control 
group. 

Control groups for different-room and different-teacher
effects are suggested by previous research (3, 5), which indicates
that test performance and learning differences
may arise between groups due to contextual changes in
the learning environment, such as the room where original
and interpolated learning took place.
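
The match-pairing step in the procedure can be illustrated with a short sketch. The paper does not spell out how the matching was done, so the method below is an assumption: students are ranked by Test 1 score, the ranking is cut into blocks of four adjacent scores, and one student from each block is randomly placed in each group. The function name, the seed, and the sample scores are hypothetical.

```python
import random


def matched_groups(scores, n_groups=4, seed=0):
    """Split students into n_groups with similar Test 1 score distributions.

    scores: dict mapping a student id to a Test 1 score.
    Assumed method (not stated in the paper): rank by score, take blocks of
    n_groups adjacent students, and shuffle each block across the groups.
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get, reverse=True)
    groups = [[] for _ in range(n_groups)]
    for start in range(0, len(ranked), n_groups):
        block = ranked[start:start + n_groups]
        rng.shuffle(block)  # random assignment within a block of near-equal scores
        for group, student in zip(groups, block):
            group.append(student)
    return groups


# Hypothetical Test 1 scores for eight students (out of sixty items).
test1 = {"s1": 52, "s2": 47, "s3": 45, "s4": 44,
         "s5": 33, "s6": 31, "s7": 30, "s8": 28}
groups = matched_groups(test1)
group_means = [sum(test1[s] for s in g) / len(g) for g in groups]
```

Because every group receives one student from each block, the group means on Test 1 stay close to one another, which is the property the matching is meant to secure.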


Results and Discussion 
The results of the machine-novelty comparison indicate
that while there was a slight difference in favor of the
machine no-feedback group over its traditional test administration
method control, the difference was not
statistically significant (t < 1.0). This indicates that any
differences found in the machine no-feedback group versus
the traditional test administration method would not
be a function of the novelty of the machine itself.


Table 1.—Mean Correct Responses for Introductory Psychology
Classes Tested

                     Old Teacher   New Teacher   Machine     Machine
                     Old Room      New Room      Feedback    No-Feedback

Standard testing     31.37         31.00         31.95       31.68
                     (8.66)*       (8.61)        (8.52)      (9.07)
Treatment testing    33.26         33.26         38.55       34.18
                     (8.64)        (8.54)        (6.79)      (10.04)
N                    21            21            21          21

*Standard deviations in parentheses


Figure 1.—Comparison of Test 1 standard testing with Test 2
feedback versus no-feedback testing. [Line graph: number of items
correct by test procedure (standard testing, treatment testing),
with separate lines for the feedback and no-feedback groups.]

One factor that could have produced a difference on 
Test 2 was that students taking a test in a new room 
proctored by a new professor might have behaved differ- 
ently than if they had been in their regular classroom with 
their regular professor. However, there was no differential 
effect due to teacher-room differences on multiple choice 
test performance (t < 1.0). 

With the effects of both machine and teacher-room 
difference accounted for, it seems reasonable to conclude 
that any difference on Test 2 for the feedback versus the 
three control groups might be due to the effects of im- 
mediacy of feedback. Figure 1 illustrates the effect of 
feedback versus the no-feedback control on test per- 
formance. The mean difference of 4.5 more correct items in 
the feedback group over the three control groups was 
reliable, F(3, 60) = 4.01; p < .025. Table 1 presents the
means and standard deviations for the 84 Ss in the Introductory
Psychology course. Using the Newman-Keuls
method for testing individual mean differences, no differ- 
ences were found among the control groups. 

Students receiving rapid item-by-item feedback on
multiple choice examination items performed better than
those receiving no feedback by approximately 7.5%. One
factor that may have been responsible for the higher
performance of the feedback group was test dependency
(i.e., knowledge of the correct answer to one item could
have given away all or part of the answer to a future item).
In order to circumvent this potential source of confounding,
five different instructors were asked to independently
analyze Test 2 and point out any items that were, in their
opinion, dependent. There were sixteen items which at
least three of the five instructors judged as possibly having
some degree of dependence. When these items were excluded
from data analysis, the previous results were again
obtained. Thus, one can reasonably assume that the previous
difference was probably due to feedback and not
item dependence.
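
The item-dependence check lends itself to a brief sketch: an item is set aside when at least three of the five judges flag it, and tests are then rescored without the flagged items. The function names, the judges' flags, the answer key, and the responses below are all hypothetical; only the three-of-five rule comes from the text.

```python
from collections import Counter


def dependent_items(judgments, min_votes=3):
    """Return the items flagged as dependent by at least min_votes judges.

    judgments: one collection of flagged item numbers per judge.
    """
    votes = Counter(item for flagged in judgments for item in set(flagged))
    return {item for item, n in votes.items() if n >= min_votes}


def rescore(responses, key, excluded):
    """Count correct answers, skipping items judged to be dependent."""
    return sum(1 for item, answer in responses.items()
               if item not in excluded and key[item] == answer)


# Hypothetical flags from five judges (item numbers on Test 2).
judges = [{2, 5}, {2, 7}, {2, 5}, {5}, {9}]
excluded = dependent_items(judges)

# Hypothetical answer key and one student's responses.
key = {1: "A", 2: "B", 5: "C", 7: "D"}
responses = {1: "A", 2: "B", 5: "A", 7: "D"}
score = rescore(responses, key, excluded)
```

Rerunning the group comparison on such rescored totals is what allows the feedback effect to be separated from any help one item gives on a later one.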


NOTES 


1. Portions of the data were presented at the meeting of the 
Southeastern Psychological Association, New Orleans, May 1973. 

2. This research was supported in part by the Marquette 
University Committee on Research. 


3. Reprint requests should be sent to R. Stephen Fulmer,
Bristol Regional Mental Health Center, 26 Midway Street, Bristol,
Tennessee 37620.


To this end, studies carried out in the past decade have
attempted to investigate the personality factors and dimen- 
sions of teachers and school administrators. Research on 
dogmatism has demonstrated that open-minded instruc- 
tional leaders exhibit behaviors which are consistent with 
the goals of providing a democratic atmosphere conducive 
to learning by inquiry (1). Further, other researchers have 
presented evidence to suggest that open-minded individuals 
have the requisite attitudes that characterize effective 
teachers and administrators (4-6). 

Recently, human relations training has been promoted 
as a promising method of reducing dogmatic attitudes 
(2,8,9). In these studies the reduction of dogmatic atti- 
tudes has been accomplished through group experiences. 
Exposure to human relations training can help in pro- 
moting desirable attitudes and is consistent with some 
theories of educational administration (3). 

The purpose of this study was to ascertain whether 
dogmatism of prospective school administrators coming 
largely from the Middle and Far East can be modified 
through exposure to a group experience in human rela- 
tions. Additionally, the study was designed for a relatively 
short period of time to test whether changes in attitudes 
can be accomplished quickly. 


Method 
Subjects 

The Ss used in this study were three female and thirteen
male graduate students registered for the Spring
1974 semester in a program leading to a master's degree
in Educational Administration at the American University
of Beirut. The Ss were randomly assigned to either the
experimental or the control group. The experimental
group consisted of two females and six males, with an
average age of 34 years; the control group consisted of
one female and seven males, and the average age was 35
years. For both groups, most of the Ss were married and
all subjects had previous teaching experience.


Instrument 

The measuring instrument used in this study was an
adaptation of Rokeach’s Dogmatism Scale (Form E).
Form E was specifically designed and validated to assess
the degree of open- and closed-mindedness in an individual
(5). Rokeach (7) states that the instrument can be
used to measure general authoritarianism and intolerance.

Form E was adapted for use in the present study by
conducting an item analysis on a pilot sample of 75 Ss in
the Education Department. Of the original 40 items, 11
items did not discriminate for the sample and were subsequently
dropped from the scale. This procedure yielded
an adapted scale of 29 items with an odd-even reliability
coefficient of .85. This was considered sufficient for the
purpose of the study.

The Ss were required to indicate their feelings about
each item on a 6-point forced-choice scale. The choices
ranged from “I agree very much” (+3) to “I disagree very
much” (-3). A constant of 4 was added to each item
score to eliminate negative numbers. A higher score on the
Dogmatism Scale indicates a stronger dogmatic attitude.
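
The scoring rule just described can be sketched in a few lines. The sketch assumes that the six forced-choice options are coded -3, -2, -1, +1, +2, and +3 (no neutral point) and that a total score is the sum of the 29 adjusted item scores; the function and constant names are hypothetical.

```python
VALID_RESPONSES = {-3, -2, -1, 1, 2, 3}  # 6-point forced choice, no zero (assumed coding)
N_ITEMS = 29                             # items kept in the adapted scale
CONSTANT = 4                             # added to each item to remove negative numbers


def dogmatism_score(responses):
    """Total score for one respondent; higher means more dogmatic."""
    if len(responses) != N_ITEMS:
        raise ValueError("expected one response per item")
    if any(r not in VALID_RESPONSES for r in responses):
        raise ValueError("responses must be -3..-1 or +1..+3")
    return sum(r + CONSTANT for r in responses)


lowest = dogmatism_score([-3] * N_ITEMS)   # strongest disagreement on every item
highest = dogmatism_score([3] * N_ITEMS)   # strongest agreement on every item
```

Under these assumptions each item contributes between 1 and 7 points, so totals on the adapted scale can range from 29 to 203.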


Procedure 


The Ss in the study were taking basically the same 
course work as they were all approximately at the same 
stage in their graduate work. It was announced in class 
that some students would be chosen to participate in 
small group discussions outside of class. The purpose was 
given as an opportunity to become acquainted with class 
members in another setting. No credit was given for par- 
ticipation, and attendance after the experimental group 
was chosen was voluntary. The experimental group 
attended all other regularly scheduled classes in addition 
to the human relations sessions. The control group 
attended only the regularly scheduled classes. 

The two groups, experimental and control, of eight 
members each responded to the revised Dogmatism Scale 
as a pre-test. The means were tested and the difference 
was found not to be significant, as seen from Table 1. The 
experimental group was exposed to eight 12-hour human 
relations groups meeting over a period of four weeks. The 
leader of the group, a faculty member not previously 
exposed to the Ss, had a Ph.D. in counseling and was a 
trained group leader. 

The human relations group was basically unstructured, 
had no fixed agenda, and stressed the expression of feel- 
ings and ideas experienced within the group and/or out- 
side of it. The topics of discussion ranged from very per- 
sonal problems to academic and on-the-job problems. An 
effort was made by the leader to relate the discussion as 
much as possible to the school administrator's role. The 
topics and the feelings that they generated were held confidential
by the experimental group, thus helping to
develop trust within the group as well as to keep the interaction
between the control and the experimental groups
at a minimum.

At the end of four weeks a post-test was administered
to both the experimental and control groups. Tables 1 and
2 show the results of the correlated and independent t-tests
that were computed.


Results 

From Table 1 it can be seen that the experimental
group and the control group did not significantly differ on
their pre-test scores. From Table 2 it can be seen that the
experimental group's mean post-test score was significantly
less than its pre-test mean score, whereas the control
group’s mean post-test score was greater (though not significantly)
than its pre-test score. Further, it can be seen in
Table 1 that the experimental group’s post-test mean
score was significantly lower than the control group's
post-test mean score.
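
The two kinds of t-test named in the results can be sketched as follows. This is a minimal illustration using the standard pooled-variance formula for independent groups and the difference-score formula for correlated (pre/post) measures; the scores are invented for the example and are not the study's data, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def independent_t(a, b):
    """Pooled-variance t for two independent groups; returns (t, df)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def correlated_t(pre, post):
    """t for paired pre/post scores from the same subjects; returns (t, df)."""
    diffs = [x - y for x, y in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Invented scores for two groups of eight subjects (not the study's data).
exp_pre = [125, 120, 130, 118, 122, 127, 115, 131]
exp_post = [116, 114, 120, 112, 115, 118, 110, 119]
ctl_post = [121, 119, 126, 117, 123, 125, 114, 122]

t_paired, df_paired = correlated_t(exp_pre, exp_post)
t_indep, df_indep = independent_t(exp_post, ctl_post)
```

With eight subjects per group the degrees of freedom come out as 7 for the correlated test and 14 for the independent test, matching the values given in the table headings.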
Table 1.—Summary of Independent t-Tests of Control and
Experimental Groups on Pre- and Post-Tests


             Experimental Group   Control Group   Independent t-value
             (N=8)                (N=8)           (df=14)

Pre-test     X̄ = 119.13     7.61     [other entries illegible]
Post-test    [entries illegible]


*Significant at the .01 level
**Significant at the .05 level

Table 2.—Summary of Correlated t-Tests of Pre- and Post-Test
Means for the Experimental and Control Groups


Group           Pre-test            Post-test          Correlated t-value
                                                       (df=7)

Experimental    X̄1 = [illegible]    X̄2 = 114.63        4.49*
                SD1 = 7.61          SD2 = 5.48
Control         X̄1 = 119.13         X̄2 = 120.88        [illegible]
                SD1 = 7.00          SD2 = 5.74


*Significant at the .01 level
**Significant at the .05 level
naire it was found that group experiences should become
a formal part of the academic program in educational
administration. The participants felt that the human relations
group provided a vehicle for learning that was not
present in their regular classroom. Also, the experimental
group felt that they were able to get to know their classmates
in a more intimate way through the group experience,
and that the experience was meaningful enough that
they recommended human relations experiences as a regular
part of the graduate program. It gave them an opportunity
to develop contacts with their peers in a manner
not normally found in traditional programs. The experience
also gave the group the opportunity to interact with the
group leader, a member of the faculty, whose concern
about their welfare was highly appreciated by the students.

Whether the changed attitudes of the experimental 
group are permanent and whether these attitudes are con- 
sistent with their actual behavior as school administrators 
are important considerations for further investigation. 
Research should be done with larger groups, and it should 
have a component which assesses actual behaviors in 
schools. 

Recent evidence has been accumulating to show that 
group experiences should become an integral part of the 
training of school administrators. Savage (8) has been a 
strong proponent of this approach. This new emphasis is
certain to become even stronger in the future than it is
now.


ABSTRACT


A comparative content analysis of the index terms employed by two different editions of twelve introductory psychology textbooks, 
utilizing principal components analysis with varimax rotation, revealed a definite trend toward more uniformity among the textbooks 
with respect to the terms employed in the later versions than those used in the earlier editions of these texts. The relative prominence 
of various areas of psychology, however, remained about the same in both editions of the selected textbooks. The Spearman rho of .72
(p <.01) between the number of prominent terms employed in the two editions indicated that the revision did not result in any 
substantial change in the relative status of the texts in regard to the thoroughness of their indices. 


THE PAST TWO DECADES have witnessed the exten- 
sive growth of psychology both as a science and as a 
profession. Similarly, the popularity of psychology courses 
taught at the college level has increased substantially. In 
the undergraduate curriculum, the introductory psychology 
course is probably among the most commonly offered 
courses since it is offered by 92% or more of the universit- 
ies, liberal arts colleges, and junior colleges (14:64). To 
meet this demand, an increased number of textbooks have 
been written for the introductory course, and these 
textbooks have been revised usually with greater frequency 
than textbooks for other psychology courses. 

The contents of these textbooks are subjectively evaluated
by experts and these reviews are published in
Contemporary Psychology and other appropriate sources
(1). However, few attempts have been made to analyze
systematically the contents (e.g., terms employed in the
texts as reflected in the subject indices) in order to
determine their similarity or relative comprehensiveness.
In addition, one should inquire whether revisions of
introductory texts represent any improvement with
respect to the convergence of the prominent terms that
constitute the core of the introductory psychological
literature.

A previous study (23) dealt with the first of the
aforementioned questions. The present study is
concerned with the second problem, namely, systematically
comparing the contents of the two different editions of
some introductory psychology textbooks.


Method 


Sample

Of the 25 introductory psychology textbooks that the 
authors were able to obtain from the faculty, as well as 
from the publishers, in the fall of 1972, twelve had been 
revised between 1968 and 1972. Depending on availability, 
one of the past editions of each of these books was then 
borrowed from either local libraries or faculty members. 
The terms listed in the subject indices of the two different 
editions of these twelve books (2-13, 15-22, 24, 25, 27, 28) 
were treated as two separate content domains for the pur- 
pose of this analysis. 


Procedure 


The following procedure was followed for each edition:
First, each text was assigned a two-digit identification
number. Second, lists were prepared of all terms in the
main headings of the indices, and the identification
numbers of the textbooks in which a term appeared were
noted beside each term. Third, after alphabetizing the
terms, each term was assigned a four-digit identification
number and was punched on an IBM card. Each card
representing a distinct term contained the identification
number of the term, the scores of 1 or 0 in the subsequent
columns (depending on whether that term was used in any
of the twelve books of a particular edition), the total score
for the term (number of times the term was used), and,
finally, the term itself. Thus, the total number of terms
was determined (5386 in the old and 6159 in the new
edition), as well as their relative frequency for each edition
separately.

Since many of the terms in each edition were used
only once in a set of twelve books, it was decided to base
the analysis only on those terms which were used two or
more times in the texts in a particular edition. Thus, there
were 1617 terms which were used twice or more in the old
edition and 1895 terms used twice or more in the new edition
of the same textbooks. The first data matrix (old edition),
therefore, was of 1617 X 12 and the second, of 1895
X 12 size. Because of the dichotomous nature of the scores
and the skewed shape of the distributions, tetrachoric correlations
were computed to represent correlations among
the books for each edition separately. Table 1 presents the
correlations for the old edition (above diagonal) and new
edition (below diagonal) textbooks. The two 12 X 12
correlation matrices (one for the old and the other for
the new edition) were analyzed by means of the principal
components method, using unities in the diagonal. Factors
whose latent roots were not less than .80 were retained and
rotated by the varimax routine, resulting in six principal
components for each edition (see Note 1). A second-order factor
analysis, using the intercorrelation of six factors obtained
through promax rotation (a method which yields oblique
simple structure) was also conducted for the two editions
separately. The methods of analysis and rotation were the
same as before, but the retention of a factor depended on
its having a latent root of 1.00 or larger.


Table 1.—Correlations among Twelve Books in the Old (above diagonal) and New (below
diagonal) Editions.

Book     1     2     3     4     5     6

 3     .20   .15  1.00   .10   .25   .20
 4     .20   .25   .20  1.00   .30   .15
 5     .40   .30   .15   .20  1.00   .40
 6     .25   .30   .15   .15   .30  1.00
 7     .30   .25   .25   .25   .25   .15
 8     .25   .20   .20   .15   .20   .15
 9     .30   .15   .10   .15   .25   .15
10     .15   .25   .20   .25   .10   .15
11     .25   .20   .20   .20   .20   .15
12     .20   .20   .20   .30   .20   .20

Book     7     8     9    10    11    12

       .20   .30   .10   .15   .15   .20
       .25   .40   .25   .35   .20   .20
       .10   .10   .10   .10   .20   .10
       .35   .35   .30   .40   .35   .30
       .30   .25   .10   .10   .20   .10
      1.00   .30   .20   .05   .20   .20
       .35  1.00   .15   .20   .15   .15
       .15   .15  1.00   .35   .20   .20
       .30   .20   .25   .20  1.00
       .30   .20   .15   .25   .20

[Rows 1 and 2 of the first panel and the row labels of the second
panel are not legible in the source.]


The number of most prominent terms (used in six or
more of the books in each sample) was also determined in
order to compare the relative thoroughness of the two
editions. Spearman's rank order coefficient was then
computed between the number of prominent terms used
in the old and those used in the new edition in order to
determine the stability of the relative prominence of the
books from one edition to another.

Finally, to determine on a rather broad scale whether
the subject matter of introductory psychology changed
qualitatively over a period of about ten years (the average
time lag between the two selected editions), four broad
areas were somewhat arbitrarily designated to cover the
spectrum of psychological material: (a) psychometrics and
statistics; (b) learning and physiological; (c) personality
and psychopathology; and (d) social and developmental.
By classifying the total number of most prominent terms
from each sample (edition) into these four areas and then
comparing the results of old and new editions, the authors
hoped to secure a reasonably realistic impression of the
change in emphasis on various subfields of psychology as
reflected in the two different editions of these twelve
books.
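
The term-by-book scoring in the procedure can be sketched in a few lines. The sketch assumes each book's index is available as a set of terms; the function names and the three toy indices are hypothetical, and the thresholds follow the text (two or more books for inclusion in the data matrix, six or more for a prominent term).

```python
from collections import Counter


def term_counts(indices):
    """Number of books whose index contains each term."""
    return Counter(term for terms in indices.values() for term in set(terms))


def incidence_matrix(indices, min_books=2):
    """Rows are retained terms, columns are books, entries are 1 or 0."""
    counts = term_counts(indices)
    books = sorted(indices)
    kept = sorted(t for t, n in counts.items() if n >= min_books)
    rows = [[1 if t in indices[b] else 0 for b in books] for t in kept]
    return kept, rows


def prominent_terms(indices, min_books=6):
    """Terms listed in at least min_books indices."""
    return {t for t, n in term_counts(indices).items() if n >= min_books}


# Three toy indices keyed by two-digit book numbers (hypothetical data).
toy = {
    "01": {"learning", "memory", "id"},
    "02": {"learning", "memory"},
    "03": {"learning", "ego"},
}
terms, matrix = incidence_matrix(toy)
```

In the study the corresponding matrices had 1617 and 1895 rows and twelve columns, and correlations between the columns were then computed.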


Results and Discussion 


Factor Analysis of Textbooks 


In accordance with the criterion of retention mentioned
previously, six factors were retained and rotated for each
edition separately. The factor loadings after rotation of the
twelve books in the old edition are presented in Table 2,
while Table 3 presents analogous results for the new
edition. The factors as a whole account for 69.7%
of the variance for the old edition (Table 2) and 67.4%
for the new edition (Table 3). The
percentage of variance accounted for by various factors
indicates that no single factor holds any [...]
position in either analysis (the percentage of
variance ranges between 9.2 and 13.7 in Table 2 and
between 8.3 and 14.5 in Table 3). All of the factors in the
[...] analysis (old edition in Table 2) can be justifiably
designated as group factors since they have substantial
loadings (± .25 or larger) on at least two but not more
than five of the twelve textbooks. In the second analysis
[...] coefficients of congruence were computed [...]
pairs (e.g., Factor A in Table 2 [...], etc.). These
coefficients for Factors [...] F were .69, .63, .84, .72, .69 [...]
there seems to be [...] similarity between the factors, but


Table 2.—Factor Loadings after Varimax Rotation of the Twelve Books
(Old Edition)


Factors 


Edwards (1968) 

Hebb (1958) 
Hilgard (1957) 
Kendler (1963) 
Kimble (1956) 
Krech & Crutchfield (1958 
Lindgren & Byrne (1961) 
McKeachie & Doyle (1966) 
Morgan (1956) 
Munn (1956) 
Ruch (1958) 

la Whittaker (1965) 
Percent of variance 


68 -02 -10 .26 .36 -01 .66 
32 11 -03 .72 04 .06 .64 
49: 73 303  :02 as 04 59 
405 .82 410 21 -02 .10 .74 
35 .23 .32 =.22 26 43 ..57 
05 19 05 .09 86 .09 .80 
3.5 13.7 10.6 13.0 92 9.7 69.7 


Table 3.—Factor Loadings after Varimax Rotation of the Twelve Books (New Edition) 


Factors 


Edwards (1972) 

Hebb (1970) 

Hilgard, Atkinson, & Atkinson 
Kendler (1968) 


Kimble & Farmezy (1968) 
Krech, Crutchfield, & Livson (1969) 76 04 16 -05 06 16 64 


& Petrinovich (1968) 40  .14 .12 70 29 09 - 


a971) | 13 .07 .94 14 13 09 95 


Lindgren, Byrne, 
McKeachie & Doyle (197 D 


Morgan & King (197 D) 47  .83 -03 -03 2 -06 74 
& Fernald (1972) 05 08 10 06 22 82 74 


-.01 62 16 27 02 37 62 
12 04 08 20 36 06 64 
14.5 113 83 121 115 95 674 


Munn, Fernald, 
Ruch & Zimbardo (1971) 
Whittaker (1970) 


Percent of variance 
it is generally quite negligible except in the case of Factor
C. where the coefficient of congruence is not interpreted 
as a correlation coefficient and usually has to be higher than 
.80 to indicate even a semblance of similarity between two 
factors. If the indices of these books, in the two selected 
editions, are fair indicators of the corresponding termino- 
logical contents, then the two editions would not be given 
a high rating on factorial congruence. However, it does not 
mean that the terms used in the index of the old edition of 
a book are different from those in the new edition of the 
same book, but it does indicate lack of stability between 
the two editions in the content areas represented by 
certain subgroups of texts. 
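
The coefficient of congruence used in this comparison can be sketched as follows. The sketch assumes the usual definition, the sum of cross-products of two columns of loadings divided by the square root of the product of their sums of squares; the paper does not print its formula, and the loadings below are hypothetical rather than taken from Tables 2 and 3.

```python
from math import sqrt


def congruence(x, y):
    """Coefficient of congruence between two columns of factor loadings."""
    if len(x) != len(y):
        raise ValueError("loadings must cover the same books")
    cross = sum(a * b for a, b in zip(x, y))
    return cross / sqrt(sum(a * a for a in x) * sum(b * b for b in y))


# Hypothetical loadings of four books on one factor in each edition.
old_factor = [0.70, 0.10, 0.60, 0.05]
new_factor = [0.65, 0.20, 0.55, 0.10]
phi = congruence(old_factor, new_factor)
```

Unlike a correlation, the coefficient does not subtract the column means, which is one reason it runs high and is read against a stricter threshold such as the .80 mentioned in the text.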

The second-order analysis, based on six factors obtained 
through promax (oblique simple structure) solution and 
employing the principal components method with varimax 
rotation, resulted in two second-order factors for the data 
of the old edition and only one second-order factor 
for the new edition—the criterion for retention of a factor in 
both cases of the second-order analysis was having a 
latent root of 1.00 or larger. Table 4 embodies Factors 1 
and 2 based on the old edition and one factor based on the 
texts in the new edition. While Factors 1 and 2 together 
account for 46.6% of the variance, the single factor emerging 
in the data of the new edition accounts for 39.5% of the 
variance. Thus, it seems that the new edition, in contradis- 
tinction to the old, evidences a greater degree of conver- 
gence and uniformity in the contents of the indices of the 

selected books than does the old edition. The books in 
the old edition do not seem to have as much common 
material among themselves as do their counterparts in the 
revised version. Over the years, therefore, authors have 
tended to employ a greater proportion of common terms 
in the indices of introductory texts. 


Comparison of the Degree of Prominence 


The degree of prominence of a term, in either edition, 
was defined as the inclusion of a term in the indices of 
six or more books. The twelve books in the old edition 
ranged from a low of 115 terms, used by Hebb (4), to a
high of 259 terms, used by both Morgan (19) and Whittaker 
(27). The total number of prominent terms in the old 
edition was 305, compared with 368 in the new edition. The 
mean number of prominent terms in the old edition was 
200.3, with a standard deviation of 51.5. The range for 
prominent terms in the new edition was from 150 by Hebb 
(5) to 295 by Hilgard, Atkinson, and Atkinson (7), which 
text included 255 prominent terms in its 1957 edition. 
The average increase in the number of terms in the new 
edition was 48.2, indicating that the later edition of these 
textbooks was much more comprehensive than the earlier 
one. Spearman’s rho between the numbers of prominent 
terms used in the two editions of these twelve books turned 
out to be .72 (p < .01), indicating a substantial degree of 
stability in the relative prominence of the indices across 
two editions. 
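
The rank-order comparison can be sketched in plain Python. The sketch computes Spearman's rho as the product-moment correlation between ranks, with tied values given their average rank; the function names are hypothetical and the counts are illustrative rather than the full data for the twelve books.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank-order coefficient between two equal-length lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Illustrative counts of prominent terms for four books in each edition.
old_counts = [115, 259, 255, 200]
new_counts = [150, 280, 295, 240]
rho = spearman_rho(old_counts, new_counts)
```

A value near 1 means that books with thorough indices in the old edition tend to keep that standing in the new edition, which is how the reported .72 is interpreted.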


Table 4.—Results of the Second-Order Factor Analysis for the 
Old and New Editions 


Factors (Old Edition)
1 2 


Factor (New Edition) 


Variables 1 


Percent of 
Variance : . 39.5 


Emphasis on Various Areas of Psychology 


Classification by two research assistants of the prominent 
terms (305 in the old and 368 in the new edition) into four 
broad areas resulted in the following percentage-wise distribution:
(a) psychometrics and statistics, 14% in both the
old and new editions; (b) learning and physiological, 48%
in the old and 52% in the new; (c) personality and psychopathology,
23% in the old and 19% in the new; and (d)
social and developmental, 15% of the total terms in both 
editions. Thus, the relative emphasis given to the afore- 
mentioned areas was about the same in both editions, 
although the total number of prominent terms was much 
larger in the new edition of the books. In spite of the 
subjective character of the foregoing procedure, the results 
seem to conform to those of the first-order principal 
components analysis in which the same number of factors 
emerged in the data of both editions. 


Evaluation and Conclusion 


It is recognized that the present findings about the 
contents of these textbooks are valid to the extent to which 
the indices, in either edition, accurately represent the con- 
tents and organization of the twelve selected books. Some 
of the authors apparently devoted considerable care to the 
preparation of indices, while others did not. Also, the time 
gap between the two editions was not the same for all 
texts, especially since in some cases the authors were 
unable to secure the immediate past edition and in other 
cases the texts had gone through no more than two 
revisions. In general, these conclusions seem to make 
sense in the light of what is generally known about the 
quality of these texts and their revisions (1). 


NOTE 


1. Although a number of authorities on factor analysis
recommend a latent root of 1.00 or larger for retaining a factor
for rotation, .80 was selected as the cutoff point for first-order
analysis in this study because it permitted the accounting of about
70% of the variance by the components for the data of each edition.
Had 1.00 been used as the cutoff instead of .80, the authors
would have ended up with about 40% of the variance and with
three, instead of six, components extracted in each case.
On the other hand, changing the .80 criterion to a lower
figure would have resulted in the retention of a number of
additional components whose variance contributions were
minimal.


QUERESHI AND ZULLI 39


COLLEGE GPA AS A PREDICTOR OF


AN OLD QUESTION


TERRY L. JAMES 
Westmar College 
Le Mars, Iowa 


RECENT TRENDS IN TEACHER education have 
presented institutions with a perplexing paradox. 
The companion problems of teacher surplus and high 
enrollments in teacher education have brought about 
widespread institutional efforts toward more selective 
admission of primarily low-risk candidates to teacher
education programs. Based upon five decades of teacher
effectiveness studies (2-5) which lend modest support for
college grade point average (GPA) as a predictor of teaching
success, that criterion seems to be the one most frequently 
employed in selective admissions today (1). 

Simultaneously, recent trends toward competency-based
teacher education have increased pressures to focus
[upon prospective] teachers’ mastery of specific
completion competencies rather than upon the traditional,
generalized global ratings of effectiveness used in previous
studies. Proponents of competency-based criteria for
teacher success often lead the chorus of criticism of the
GPA basis for admissions, charging that if [competency-based]
criteria were employed, probably no [relationship would]
be found between college GPAs and success as a teacher.
The present study tests the validity of this assumption.

A second assumption tested by this study is that
the significant, though not high, positive correlations
found in earlier studies between college GPA and measures
of teaching effectiveness comprise a linear or uniform
relationship. Many earlier studies operated statistically
upon this assumption, concluding, in effect, that if
knowledge of college GPA tells you a little bit about a C student's
potential for success, it would tell you as much about
a B student's potential and an A student's potential.
However, it seemed probable to the authors that the relationship
is far from uniform, and that while GPA might tell one a
great deal about prospects for success by a 2.00 student,
it might tell virtually nothing about the prospects for a
student with a 2.80 average.


This study, then, was designed to test the following
two hypotheses:


1. The statistical relationship between academic success,
as measured by the cumulative college GPA, and
success in student teaching, as measured by the six
teacher effectiveness ratings used in this study, is
accounted for by those people in the lower GPA
categories. Figure 1 presents the graphic
representation of this hypothesis.

2. Four competency ratings employed in this study will
be related to college GPA in a pattern similar to that
of the two global ratings of effectiveness. All patterns,
in effect, will reflect Figure 1.


Procedure 


Instrumentation and Data Collection 


Two types of student teacher effectiveness ratings
were used: global ratings and competency ratings in four
categories.


Figure 1.—Hypothesized relationship between academic success
and student teaching effectiveness with a stepwise elimination of
lower GPA categories from a hypothetical sample


Global Ratings: Two global ratings were used. These
were the cooperating teacher's Recommended Grade in
Student Teaching (RGST) and a Personal Impact Rating
(PIR). The PIR utilized two hypothetical situations
designed to commit the cooperating teacher in a very 
personal way. The first situation was concerned with the 
type of recommendation the cooperating teacher would 
give if the student teacher were being considered for a 
job in his/her school. Choices ranged from “cannot 
recommend” to “outstanding prospect, hire if available.” 
The second situation was concerned with how strongly the 
cooperating teacher would recommend the enrollment of 
his/her child in a class to be taught by the former student 
teacher. The cooperating teacher could choose between the 
former student teacher and another beginning teacher 
about whom nothing was known, the rationale being 

that the beginning teacher about whom nothing is known 
would constitute the central or neutral position between 
positive and negative poles. Choices ranged from 
“definitely enroll with the former student teacher” to 
“definitely enroll with the other beginning teacher.” 

The scores from the two situations were combined to 
produce the PIR rating for each student teacher. 

The PIR test-retest reliability coefficient was .92, 
the highest reliability obtained for the criterion 
instruments used. 

Teacher Competency Ratings: Four specific teacher 
roles or competency categories were identified, each 
considering the teacher as: (CR1) a stimulator; (CR2) 

a presenter; (CR3) an organizer; (CR4) a synthesizer. 
Each competency area was divided into two specific 
behavior components. A five-point continuum was used 
with descriptors at three points. Scores on the two
sub-parts were combined for the rating on the particular role
or competency. 

The test-retest reliability coefficients ranged from .71
to .91. 

Cumulative College Grade Point Averages: The
cumulative college GPA was taken with the completion of
the fifth semester or a minimum of 66 semester hours.
Official student transcripts were used.


Subjects 


Participants in this study were 170 secondary student
teachers who had completed all of their work through the
University of Missouri-Columbia. Student teaching was
started and completed during either the fall semester 1972,
or the first block, winter semester 1973.

The three subject areas of special education, vocational 
home economics, and vocational agriculture were 
excluded. 


Analysis of Data 


The 170 Ss were categorized into seventeen different 
levels as determined by their cumulative GPA at the 
completion of semester five. GPA at the University of 
Missouri-Columbia is computed on a four-point system: 
C=2.00, B=3.00, A=4.00. These seventeen categorical
levels were:


L1=all students
L2=2.00 and above
L3=2.10 and above
L4=2.20 and above
L5=2.30 and above
L6=2.40 and above
L7=2.50 and above
L8=2.60 and above
L9=2.70 and above
L10=2.80 and above
L11=2.90 and above
L12=3.00 and above
L13=3.10 and above
L14=3.20 and above
L15=3.30 and above
L16=3.40 and above
L17=3.50 and above.


A Pearson r was computed between each criterion 
variable and each GPA category. This was accomplished 
for each variable through the use of a stepwise elimination
procedure progressing from L1 to L17. 
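
The stepwise elimination procedure can be sketched in code. What follows is only an illustration under stated assumptions, not the computation actually used in the study: the function names (`pearson_r`, `gpa_floors`, `stepwise_elimination`) are hypothetical, any data passed in would be the reader's own, and the sketch reports just the subsample size and Pearson r at each level, leaving the significance test aside.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation; None if it cannot be computed."""
    n = len(xs)
    if n < 3:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no variation left in one of the variables
    return sxy / math.sqrt(sxx * syy)


def gpa_floors():
    """Lower GPA bound for levels L1..L17 (None means 'all students')."""
    return [None] + [round(2.00 + 0.10 * i, 2) for i in range(16)]


def stepwise_elimination(gpas, ratings):
    """Return (level, n, r) for each level, dropping lower GPAs step by step."""
    results = []
    for level, floor in enumerate(gpa_floors(), start=1):
        kept = [
            (g, r)
            for g, r in zip(gpas, ratings)
            if floor is None or g >= floor - 1e-9  # tolerance for float GPAs
        ]
        xs = [g for g, _ in kept]
        ys = [r for _, r in kept]
        results.append((f"L{level}", len(kept), pearson_r(xs, ys)))
    return results
```

Called with parallel lists of cumulative GPAs and one criterion rating (RGST, for example), the function returns one row per level from L1 to L17, which is the series that a plot of r against GPA category would display.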


Results 


As was anticipated from previous studies (2-4), corre- 
lation coefficients significant at the .001 level were found 
for the full sample between the cumulative GPA and both 
of the global ratings of student teacher effectiveness. 
Using the Recommended Grade in Student Teaching 
(RGST) as a criterion, the resulting coefficient was .29. 
Using the Personal Impact Rating (PIR), the result was 
.36. The stepwise elimination procedure previously 
described was then employed, with the results graphically 
portrayed in Figure 2. 

As student teachers with lower GPAs were system- 
atically eliminated from the sample, the relationship 
between GPA and each of the two global 
criterion variables of student teacher effectiveness was 
rapidly diminished to statistical insignificance. Using 
the recommended grade criterion, the GPA became a 
statistically insignificant predictor of success (.05 level) 
when all students with GPAs below 2.30 were eliminated 

from the sample. Using the PIR as a criterion, GPA 
continued to serve as a statistically significant predictor 
of success at the .05 level until all students with GPAs 
below 2.90 had been eliminated from the sample. 

JOURNAL OF EXPERIMENTAL EDUCATION


Figure 2.—Relationship between two global measures of student
teaching effectiveness and cumulative GPA at semester five with a
stepwise elimination of lower GPA categories from the sample.
[Plot not reproduced. Horizontal axis: GPA categories L1 through L17.
Subsample size (n) after elimination of each GPA category, L1 through
L17: 170, 163, 154, 143, 135, 124, 112, 107, 96, 87, 82, 70, 62, 55,
45, 38, 8.]


A similar pattern emerges when the elimination
procedure is employed with the four partial measures
of effectiveness focusing upon specific competencies or
roles of teachers (CR 1-4). Figure 3 demonstrates that all 
four measures of competency were initially correlated with 
GPA at the .05 level of significance, with correlations 
ranging from .36 (teacher as presenter) to .17 (teacher 
as synthesizer). GPA becomes a statistically insignificant 
predictor of all four criteria if all students with GPAs 
below 2.40 are excluded from the sample. 

Based upon these findings, the authors are inclined 
to accept the first hypothesis, as graphically presented 
in Figure 1. In effect, GPA does indeed give some predic- 
tion of success or failure in teaching, regardless of the type 
of criterion one uses for success. But beyond a certain 
minimal level of demonstrated academic competence 
(approximately a 2.50 grade point level at the University 
of Missouri), GPA tells little if anything about a student’s 
potential as a teacher. Apparently, other variables become 
more important. 

An interesting and entirely unexpected qualification 
must be noted to the above inference. In both Figures 
2 and 3, it will be observed that as the middle GPA 
group is also eliminated from the sample, all lines turn 
sharply upward again and five of the six criterion variables 
once again are correlated with the college GPA variable at 
levels exceeding the requirements for .05 significance, even 
with the considerably reduced sample size. The sixth 
variable, CR2 (teacher as presenter), follows the same 
pattern as the others but falls short of significance at the 
.05 level. Clearly, with this sample, the students who 
were highly talented academically were markedly more 
successful as student teachers than the upper-middle group. 
Finally, visual comparison of Figures 2 and 3 indicates that 
the second hypothesis is also correct. The pattern of
relationship with GPA is essentially the same for all six criteria
employed.


Discussion 


The results of this study indicate that the use of college 
GPA as a selective admission criterion for teacher education
may be useful and appropriate if used judiciously in
combination with other variables. There is a positive rela- 
tionship between GPA and both global and competency 
ratings of student teachers, but much of this relationship, 
with both types of criteria, is explained by the poor show- 
ing of very low GPA students in student teaching. The 
point at which the GPA becomes virtually worthless in 
selective admission would probably vary with the grading 
practices at various institutions. At the University of Mis- 
souri a grade point requirement exceeding 2.40 or 2.50 
would probably serve no useful purpose in improving prod- 
uct quality. 


NOTE 


1. The Personal Impact Rating instrument was constructed
by Terry L. James.


ABSTRACT


In this study an attempt was made to provide an overview of applicable evaluation procedures which can be utilized by educational 
decision-makers responsible for the arduous task of teacher accountability. Specifically described are three different structures (models) 
of evaluation: systems; benefit-cost analysis; and experimental design. A discussion follows describing the types of measures to be applied 
and includes a review of the necessary teacher-student relationship under evaluation conditions. 


HISTORICALLY, EDUCATIONAL ADMINISTRATORS
have blamed inadequate resources, bureaucratic rule,
political constraints, and a plethora of uncontrollable 
variables for their agencies’ failure to attain educational 
goals. The returns for increasing investments in education 
have apparently been below the levels the consumer (the 
tax-paying public) has been led to expect. As a result, 
educational processes and outcomes are under critical 
investigation by the consumer, and the once authoritarian 
and learned work of the professional educator is regarded 
with unusual suspicion. In an effort to defend the
utilization of dwindling resources, administrators have been
searching for ways and means which would offer the
necessary structure to permit systematic analysis and,
it is hoped, solutions to their problems.

The problems which educational administrators con- 
tinued to perceive as being most critical to the attain- 
ment of educational goals are fiscal in nature, i.e., addi- 
tional personnel, experimental programs, federal projects, 
etc. Subsequently, educators have attempted to integrate 
planning-programming-budgeting systems into their
educational systems for purposes of accurately dealing with
such fiscal concerns. There have been many attempts 
to interpret the educational financial structure. Some 
approaches have been based on apparently “good” edu- 
cational philosophies, while others are attempts to take 
concepts from successful business models. From the 
industrial world Leon Lessinger, past director of the 
Department of Health, Education and Welfare, introduced 
the concept of "educational program audit" based on the 
role of the certified public accountant; the acronym used 
in education is EPA (Educational Program Auditor). 

In education an administrator's plight is inherently 
dependent upon the ability to motivate, guide, and 
evaluate instructional staff and their operational processes. 


Teacher evaluation has long been a problem confronting 
administrators, subsequently impeding any form of 
accurate accountability in the public education domain. To 
complicate matters, the inferred relationship concerning 
appropriate and efficient teaching practices must deal 
with a very important variable, the student. And when 
one wishes to investigate the relationship between teacher 
and student, a multidimensional task is inevitable.
Concomitantly, one must take into account such variables as
school-related student attributes, non-school student- 
related attributes, program and service variables, student 
performance variables, post-school adjustment variables, 
and many, many more. 

Given this set of critical and complex circumstances, 
the administrator is required to implement a sound,
scientifically based instructional evaluation system,
complemented by an equally appropriate data collecting and
monitoring system. In a majority of cases, expert outside
assistance is necessary for at least evaluation design
purposes. However, due to lack of finances, the responsi- 
bility of designing such an instructional evaluation system 
is often routinely delegated to program administrators. 

In many cases these administrators begin to feel literally
trapped. The “trapped administrators,” according to
Campbell (5), are those who “have so committed themselves
to the efficacy of the reform that they cannot afford honest
evaluation.” A contrast is made with those administrators
who “initially justified the need for reform on the
basis of importance of the problem, not the certainty of
their answer, are committed to going on to other potential
solutions if the one first tried fails” (5). (“Reform” here is
used by Campbell to indicate a commitment to a new
program.) Because of these facts the educational
administrator creates a system that forces a high probability of
making a variety of instructionally inappropriate program
choices. Examples of these poor decisions are found in
curriculum innovations, teaching methods, staffing models, 
and pupil management approaches. 

In an attempt to develop a systematic procedure for 
dealing with the task of making appropriate and efficient 
decisions, the following presentation will suggest certain 
means of carrying out teacher evaluation and effectively 
utilizing the data produced. 

Methods employed in the development of an instruc- 
tional evaluation system (evaluation of teachers) can be 
generalized to any personnel evaluation problem. Teachers 
were chosen as subjects of this paper on evaluation
because: (a) their actions are often considered more visible
than the administrator's and, therefore, subject more often
to criticism; (b) many administrative policies are initiated
through teachers; and (c) traditionally administrators are
responsible for the process of teacher evaluation. Few 
texts in educational supervision and administration provide 
extensive discussion of teacher evaluation methods; how- 
ever, the importance of teacher evaluation is invariably 
mentioned. Lindley (18) states that "evaluation of teach- 
ing is not only desirable but quite necessary and even 
inevitable." Others tend to support this statement. 


Teacher Evaluation 


A good deal of varied opinion exists regarding the form 
teacher evaluation should take. Gage (14) recommends 
that the complex behavior of teaching be broken down to 
measure "micro-effectiveness." He says: 


Rather than seek criteria for overall effectiveness of teachers 
in the many, varied facets of their roles, we may have better 
success with criteria of effectiveness in small, specifically 
defined aspects of the role; if such laws could be developed, 
they might eventually be combined . . . to account for the 
actual behavior and effectiveness of teachers with pupils 
under genuine classroom conditions. 


Saadeh (22) states that:


Viewing the whole phenomenon of teaching effectiveness 
in terms of its parts seems to ignore the necessity of treat- 
ing the teacher—learning act as a totality. 


Evaluation Designs


An attempt will be made to discuss the subject of
teacher evaluation from the perspective that the only
purpose for evaluation is to supply information upon
which more appropriate and efficient instructional-based
decisions can be based. Data from evaluative processes
may be applied to decisions in two ways. The first
is to indicate that a decision should be made. The second
is to support decisions already made (23). Both may be
found in evaluation designs that are either post-hoc or
prearranged. Figure 1 compares the two designs in terms of
the sequences of program events. In the post-hoc
example, data are collected, evaluation designed, data
analyzed, and decisions made. In the prearranged situation,
evaluation is designed, program initiated, data
collected and analyzed, and the decision is made.

In the post-hoc situation, data are gathered with 
unavoidable bias. Unless the data collection process takes 
place under a prearranged design, “the study must be 
classified as a false experiment” (4). Examples of post-hoc 
designs can be found in the Westinghouse, Ohio (26) study 
of Headstart and the Dentler (10) review of the More 
Effective Schools program in New York. 

Contrasting the post-hoc approach, prearranged 
designs of data collection require that the educational 
administrator has anticipated a decision point or situation 
and has designed the evaluation procedure to facilitate an 
enlightened decision based on the appropriate data. 

The forces or motives dictating the types of decisions 
determine, to a large extent, the form the evaluation is to 
take. The resultant data then give evidence of the action 
the administrators should take and enable the administra- 
tor to facilitate this action after the decision. 

In order to provide educational decision-makers with 
a basic understanding of appropriate evaluation design con- 
siderations, an illustration and discussion of various struc- 
tures (forms), with accompanying measurement techniques, 
in which teacher evaluation can take place will follow.
Evaluation structures to be discussed include a systems
approach to evaluation, a benefit-cost analysis approach, 
and an experimental design approach. Although these three 
approaches far from represent the totality of procedures 
available within the evaluation domain, they do appropri- 
ately serve as viable alternatives within the arena of 
teacher evaluation. 


Structures of Evaluation 


The public school administrator has traditionally been
responsible for processes of planning, organizing,
allocating resources, staffing, coordinating, controlling, and
evaluating. Too often administrative structures evolve
into mechanisms to facilitate the functioning of these
processes individually. A balanced budget or complete
staff, however, does not necessarily imply that
educational goals are being met. Information derived from the
analysis of individual processes cannot be used to infer
institutional approximation of educational goals. In order
to determine goal approximation with some reliability, the
administrative processes must be superseded by a system
(17). It is also true that goal achievement or teaching
processes of teachers are best analyzed within the context
of the total educational system rather than the individual
teaching situation.


Post-Hoc Design: Program Process -> Data Collection -> Evaluation Design -> Decision
Prearranged Design: Evaluation Design -> Program Processes -> Data Collection -> Decision

Figure 1.—Comparison of post-hoc and prearranged designs


Figure 2.—A model educational system [diagram not reproduced]


Systems 


The evaluation of a teacher and the interpretation of 
information resulting from such an evaluation require a 
great amount of time and effort on the part of those 
involved. Usually the school administrator goes through 
a process of determining the outcome of an evaluation 
by his perception of the data before and after collection. 
The evaluation process is, no matter how sophisticated, 
limited in usefulness when not conducted and described 
as an integral part of a system. Without the structure of a 
system, the administrator is not entirely able to visualize 
the impact of the information derived from the teacher 
evaluation process. As a result, the administrator may 
react inappropriately toward a teacher’s behavior or 
underestimate the value of teacher evaluation. 

Figure 2 illustrates a system in terms of levels, flows, 
and decision points (13). The system of Figure 2 repre- 
sents a student population that has been evaluated as 
having minimal skills for entry into a specific educational 
program such as high school physics. The evaluation 
processes serve the decision points as well as provide 
information to the administrator on the effectiveness of 
the system. To assess the teachers’ effectiveness in this 
system, the evaluator must consider the nature of the 
system itself, which basically includes the following var- 
iables: student entry skills; variance of skills among 
students; total number of students; skills of the teachers; 
and the school’s financial investment. 


Benefit-Cost Analysis 


A number of the major decisions made by administra- 
tors involve the allocation of funds. And “the relation- 
ships between application of resources to a particular 
program and attainment of objectives can be determined 
by benefit-cost analysis” (9). Benefit-cost analysis has 
been defined as the ratio of the present value of future 
benefits to the present value of future costs. In a
decision-making situation, one must take into consideration the
fact that present resources are always more valuable than
future resources. The school administrator evaluating 
teacher performance does so usually with present costs 
paramount in mind. 
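
The definition above, the present value of future benefits divided by the present value of future costs, can be made concrete with a small sketch. The function names, the yearly timing of the amounts, and the discount rate are all assumptions introduced for illustration; the text does not specify how discounting should be carried out.

```python
def present_value(amounts, rate):
    """Discount a series of end-of-year amounts back to the present.

    amounts[0] is assumed to arrive one year from now, amounts[1] two
    years from now, and so on. `rate` is an assumed annual discount rate.
    """
    return sum(a / (1 + rate) ** year for year, a in enumerate(amounts, start=1))


def benefit_cost_ratio(benefits, costs, rate):
    """Ratio of the present value of benefits to the present value of costs."""
    pv_costs = present_value(costs, rate)
    if pv_costs == 0:
        raise ValueError("present value of costs is zero; ratio is undefined")
    return present_value(benefits, rate) / pv_costs
```

A positive discount rate is one way of expressing the point that present resources are worth more than future ones: the later an amount falls, the less it contributes to either present value. A ratio above 1.0 would indicate that discounted benefits exceed discounted costs.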

Cost and benefit of vocational programs may be 
determined through wages earned and taxes paid by past 
students. Cost and benefit analysis in an elementary school 
might be determined on the basis of program comparisons 
and, therefore, relate teacher evaluation to program cost. 
Such a relationship permits a more valid evaluation of 
teachers by weighting programs according to resources. 

A program receiving a great deal of support should show 
more benefit than one with a smaller allocation of 
resources—all other variables being equal. A teacher 
appearing to have greater results with a small resource 
allocation will usually receive a higher commendation than 
a teacher receiving the same results with a larger allocation 
of resources. 

By evaluating a teacher on the basis of benefit-cost
analysis, one may determine that the achievement (or lack 
of achievement) of objectives may be costing more through 
one program than another or through one teacher than 
another. However, cost does not necessarily represent 
material considerations but may be studied in terms of 
time required to complete an objective (time of educators 
or students) or the total number of individuals involved 
(number of educators or students). Consequently, the 
benefit-cost analysis approach requires that a quantity 
(variable) be designated and the appropriateness and 
efficiency of the use of the quantity be analyzed and 
related to the program (objectives). The effects of the 
variable on the program objectives are determined 
through the most appropriate evaluation or experimental
designs.


Experimental Design


In order to provide structure to the procedure of 
teacher evaluation, experimental design considerations 
are essential. The designs are presented in order of scientific 
experimental strengths—weakest to strongest. The first 
four are excerpted from Experimental and Quasi-Experimental
Designs for Research by Campbell and Stanley (6).

1. Interrupted Time-Series Design—In this setting the
comparison base is the record of previous years.
The usual mode of application is a casual version of
a quasi-experimental design, the one-group pre-test/ 
post-test design. This is used where no control group 
is possible. 

The interrupted time-series design utilizes previous 
years or months of performance of a teacher or group of 
teachers by plotting identified performance against the 
selected criterion. This design ignores all the variables and 
their variance over the time in question. It does, however, 


generate questions and therefore other studies and theories. 


* * * * *


2. Control Series Design —The most common of such 
designs is the non-equivalent control group pre-test/ 
post-test design, in which for each of two represen- 
tative groups, pre-test and post-test measures are 
taken and one receives the treatment. 

The purpose for this design is to analyze the variance in 
group performance. When applied to teacher evaluations, 
this design may be utilized for students or groups of 
teachers who are selected in a representative manner. 


* * * * *


3. Regression Discontinuity Design —If randomization 
is not politically feasible or morally justifiable in a 
given setting, there is a powerful quasi-experimental 
design available that allows the scarce good to be 
given to the most needy or the most deserving. All 
it requires is strict and orderly attention to the 
priority dimension. 
In a situation where the introduction of a program 
or teaching method is being piloted on a group of children 
or teachers, initiation may be by those who fall above or 
below some decision point on an achievement measure, 
rating, or performance continuum. For example, if an
administrator wishes to pilot the use of an innovative
teaching technique under optimal conditions, he may
rate teachers and/or students according to some criterion
and choose those who fall above a criterion “cut-off” point.
After an appropriate predetermined period of time, the
criterion is again applied and the administrator takes note
of the difference (if any) in rating at the decision point.
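
One common way to quantify the "difference at the decision point" is to fit a separate straight line to the units on each side of the cut-off and compare the two predictions at the cut-off itself. The sketch below takes that approach as an assumption; the function names and the use of simple least-squares lines are illustrative choices, not a procedure given in the text.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one predictor."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def discontinuity_at_cutoff(first_rating, later_rating, cutoff):
    """Estimated jump in the later rating at the cut-off.

    Units whose first rating is at or above `cutoff` are assumed to have
    received the piloted technique; the rest are assumed not to have.
    """
    above = [(x, y) for x, y in zip(first_rating, later_rating) if x >= cutoff]
    below = [(x, y) for x, y in zip(first_rating, later_rating) if x < cutoff]
    if len(above) < 2 or len(below) < 2:
        raise ValueError("need at least two units on each side of the cut-off")
    slope_a, icpt_a = fit_line([x for x, _ in above], [y for _, y in above])
    slope_b, icpt_b = fit_line([x for x, _ in below], [y for _, y in below])
    # Difference between the two fitted lines evaluated at the cut-off.
    return (slope_a * cutoff + icpt_a) - (slope_b * cutoff + icpt_b)
```

A value near zero would suggest no visible break at the decision point, while a clearly positive or negative value would suggest that the piloted technique shifted ratings for those just above the cut-off.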


* * * * *


4. Randomized Control Group Experiments—Randomized
experiments tend to be limited to the laboratory
and/or agricultural experiment stations. But this 
certainly need not be so. The randomized population 
may be persons, families, precincts, or administrative 
units. 


This experimental design is sometimes called the true 
experimental design. Although large groups are desirable, 
representative selection is possible within a single school, 
depending on the evaluation design and the level of
confidence (significance) the administrator believes is
necessary for a decision.


* * * * *


5. Single Subject Design—The use of replication with
comparisons of an individual’s behavior rate changes
before and after experimental intervention. Because 
the experiment is not dependent on numbers, inter- 
vening variables may be controlled more accurately 
than in any of the other designs. The subject becomes 
his own control. 


Single subject research designs are not based in statistical 
generality, and subsequently are very useful in individual 
teacher evaluation. As Sidman (24) says: 


Once the administrator has pointed out those features of 
teacher performance with which he is particularly concerned, 
... direct replication of the teaching activities may be 
accomplished either by performing the experiment again with 
new subjects or by making repeated observations on the 

same subjects under each of several evaluation conditions. 


* * * * *


6. Multivariate Analysis—The multidimensional
task of concomitantly taking into account
such variables as student attributes, non-school
environmental variables, program and service variables,
student performance variables, and post-school
adjustment variables necessitates utilization of a
multidimensional approach termed “multivariate analysis”
(28). Multiple linear regression analysis, a form of
multivariate analysis, was selected by Sommers and Joiner
(27) for purposes of conducting research when a variety
of behaviorally oriented variables were being investigated
in a study of the “disadvantaged.” They state:


It is assumed that performance or behavior is subject to the 
influence of more than one variable or condition at a time 
and that adequate explanations involve more than a single 
variable or condition. But, if several variables are proposed
as being relevant to performance, it becomes necessary to 
measure both the influence of the variables on the behavior 
we are attempting to explain and their influence upon each 
other. 


Multivariate analysis allows the evaluator to reflect 
complexities in the evaluation paradigm. The power of 
prediction as an intellectual tool resides in the fact that 
it enables one to rigorously test the adequacy of various 
theoretical evaluation models that might be proposed. 
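
As an illustration of the multiple linear regression mentioned above, the following sketch fits an outcome to several predictors at once by ordinary least squares. It is a bare-bones example under stated assumptions: the function name `fit_multiple_regression` is hypothetical, the normal equations are solved directly by Gaussian elimination, and no significance tests or diagnostics are included.

```python
def fit_multiple_regression(predictor_rows, outcomes):
    """Ordinary least squares with an intercept.

    predictor_rows: one list of predictor values per case.
    outcomes: the outcome value for each case.
    Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + [float(v) for v in row] for row in predictor_rows]
    k = len(X[0])
    # Normal equations: (X'X) b = X'y, stored as an augmented matrix.
    M = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, outcomes))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear or data are insufficient")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for j in range(col, k + 1):
                M[r][j] -= factor * M[col][j]
    # Back substitution.
    coefs = [0.0] * k
    for i in reversed(range(k)):
        rest = sum(M[i][j] * coefs[j] for j in range(i + 1, k))
        coefs[i] = (M[i][k] - rest) / M[i][i]
    return coefs
```

Each returned slope estimates the influence of one variable on the outcome with the others held constant, which is the sense in which the technique measures several influences at a time. The collinearity check reflects the quoted concern about the influence of the variables upon each other: when predictors overlap completely, their separate effects cannot be estimated.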

The various designs were described in order to suggest 
that techniques of teacher evaluation should have as their 
basis an acceptable and appropriate evaluation design with 
pre-set standards rather than some after-the-fact rationale. 
In each of the designs, the variables must be defined in a 
quantifiable format and specifically relatable for purposes 
of administrative decision-making. 


Evaluation Measures


Cognition, Affect, and Performance 


Almost all measures applied to the domain of teacher 
evaluation attempt to describe content areas of cognition, 
affect, and/or performance. In the cognitive realm, mental 
ability, knowledge of subject matter, educational level, 
and verbal ability are common areas of measurement. 
Although some investigators (16) have shown that the 
intelligence of teachers is highly correlated with student 
achievement, others (15) have demonstrated that little 
is known concerning the relationship between cognitive 
abilities of teachers and variables commonly associated
with student achievement.

Similar complexities also exist in the relationship of
student accomplishment and the teacher's command of
subject matter. Generally, knowledge of subject matter is
considered essential to teaching; however, when one considers
the goals usually set for the early years of elementary
education (reading, spelling, arithmetic, etc.), knowledge
of subject matter becomes less important than such
categories of skills as teaching methods, curriculum
design, behavior management, etc. It has been documented
that sufficient ability exists to equip a teacher with
information related to subject matter (history, chemistry,
mathematics, etc.). Consequently, educators should turn
their efforts to the more difficult task of training the
teacher to teach. Verbal ability of a teacher, usually
demonstrated on a performance level, has been shown to
have high correlations with pupil achievement (8).

In contrast, a teacher's educational training is often con- 
sidered as an inaccurate indicator of teacher effective- 
ness (11). 

Affect (i.e., attitudes, interest, sense of humor, etc.)
is usually defined by the evaluator and/or instrument that
measures it (7). Difficulty exists in obtaining valid measures
of affect, and teacher evaluation based on affective 
measures may result in information in conflict with data 
derived from measures having their bases in more concrete 
and less subjective areas. 

Complexity also exists in relation to the validity of
performance-based measures. Barro (2) says that “no
program of performance measure alone, no matter how
comprehensive or sophisticated, is sufficient to establish
accountability.” One of the major problems in the use of
performance measures is the divergent definition of
performance. The Skinnerian (25) definition, involving
counting and recording only directly observable behavior,
imposes rigid constraints on the evaluation procedure in
performance areas. However, such constraint provides
extremely reliable and valid data on which to base decisions.

The four most common measures used in the evaluation
of teachers are ratings, achievement scores, categorization,
and event counting. Each of these may be applied by an
observer, student, and/or teacher. Table 1 illustrates the
various measures and their applicability to each of these
three groups of “users.” The table shows that the use of
observers occurs more often than the use of students; ratings
are shown to be used more often than the other scales.


Table 1.—Teacher Evaluation Measures and Users 


Ratings Achievement Categories Event Counting
Observer 1 1 1 1
Student 1 0 0
Teacher (self) 1 0 1


1—indicates use of a scale 
0—indicates little or no use of a scale 


Rating Scales 


Ratings and rating scales are the most commonly used
measures. Rabinowitz and Travers (20) suggest a good
rating scale should: (a) define with precision several points
on each scale; (b) restrict each scale to well-defined and
observable behavior; (c) vary the end of the scale (where
several are used) which represents “good”; and (d) avoid
the use of words such as "average." The two most useful
rating scales are interval and ratio. Almost all the usual
statistical measures are applicable to the interval scale
unless knowledge of a "true" zero is required (30). The
ratio scale requires the existence of a true zero. All the
statistical measures applicable to the interval scale apply to
the ratio scale, as well as geometric mean, coefficient of
variation, and logarithmic scales. The behavior of a teacher
may be rated on a direct magnitude estimation scale in
order to develop a data basis for statistical manipulations.
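
The remark that the geometric mean and the coefficient of variation belong to the ratio scale can be illustrated with a short sketch. The ratings below are hypothetical magnitude estimates, not data from any study cited here, and the function name is an assumption made only for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical magnitude estimates of one teacher behavior (ratio scale).
ratings = [2.0, 4.0, 8.0, 16.0]

geometric_mean = statistics.geometric_mean(ratings)
cv = coefficient_of_variation(ratings)

# Changing the unit (multiplying by a constant) leaves the CV unchanged.
cv_rescaled = coefficient_of_variation([10 * r for r in ratings])

# Moving the zero point (adding a constant) changes it, which is why
# the statistic presupposes a true zero.
cv_shifted = coefficient_of_variation([r + 100 for r in ratings])
```

The comparison of `cv`, `cv_rescaled`, and `cv_shifted` is the point of the sketch: the statistic survives a change of unit but not a change of origin, so it is meaningful only when the scale has a true zero.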


Achievement Measures and Categorizations


Achievement measures are indirect ways of evaluating
teacher effectiveness (2) and attract sufficient criticism
from inferences often drawn from their application.
While ratings and categories are criticized for problems
arising from inferring causality from correlations,
achievement measures tend to suffer from inappropriate inferences
drawn from normalized sampling distributions.

Categorizations, as in the “Variant-Flanders Interaction
Analysis” (6:21), consist of attempts to describe and
categorize teacher behavior. Observations are made during
“encounters” or probes set at specific time intervals. The
trained observer, who may be an administrator or teacher
viewing a video-tape, checks the various categories of
behavior occurring in a specific encounter. The categories are
often arranged in groups, such as student-centered,
subject-centered, student behavior, and teacher behavior. The
total number of times a behavior is observed indicates
the percentage of time spent in exhibiting the various
behaviors (as in Openshaw (19), 80% teacher verbal
behavior to 20% student verbal behavior).
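
The conversion from category tallies to percentages of time can be sketched as follows. The category labels and the ten coded encounters are hypothetical and were chosen only to reproduce an 80/20 split like the one mentioned above; they are not taken from the cited instrument.

```python
from collections import Counter


def behavior_percentages(codes):
    """Percentage of coded encounters falling in each category."""
    counts = Counter(codes)
    total = len(codes)
    return {category: 100 * n / total for category, n in counts.items()}


# One code per encounter, recorded at fixed time intervals (hypothetical).
codes = ["teacher_verbal"] * 8 + ["student_verbal"] * 2
shares = behavior_percentages(codes)
```

Because each encounter is sampled at a fixed interval, the share of tallies in a category serves as an estimate of the share of class time spent in that behavior.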

Sorenson and Gross (29) used a categorical process
which proceeded from the assumption that 


a teacher may be said to be "good" only when he satisfies 
someone's expectations, that people differ in what they 
expect from teachers, and that a scheme for evaluating 
teachers and for predicting their effectiveness must take 
into account categories related to instructional objectives, 
methods of instruction and teacher relationships with pupils.

Rosenshine (21) divided instruments used for teacher 
evaluation into category systems and rating systems, the 
difference between the two being the amount of inference 
required of the observer. Inference here refers to the proc- 
ess intervening between the objective data seen or heard 
and the coding of those data on an observational instru- 
ment. Categorization requires the least inference, and the 
encounters the teacher has are counted (e.g., teacher asks 
evaluative question) in a manner similar to event counting. 
Rosenshine’s ratings require the formation of inferences 
and are used to rate such qualities as enthusiasm and vigor. 


Event Counting 


Event counting is similar to categorization; however, 
it generally is used when a goal has been defined in terms 
of a behavioral objective. This form of evaluation is an 
integral part of meeting program objectives. Assuming 
that the rates or frequencies resulting from the application 
of behavior modification techniques are interpretable as 
measures of performance, and assuming that one believes 
it is possible and necessary to arrange contingencies and 
objectives for a teacher, then counting can be considered 
a teacher evaluation measure. One of the most appealing 
characteristics of event counting is that by setting specific 
behavioral objectives for the teacher, one eliminates the 
need for an inference to be drawn from the data collected. 


Applying the Measures 


Users 

The use of a trained observer is a desirable method of 
applying the four measures. A trained observer provides 
reliability and objectivity for the measure being utilized. 

In some cases, the extensive experience required for reliable
application of an instrument (e.g., Variant-Flanders)
prohibits extensive use of the measure. An observer may be
used occasionally to establish “inter-rater reliability” in
ratings, categorization, and event counting.
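
One way such inter-rater reliability might be quantified for categorical codes is sketched below. The text does not say which index is intended; simple percent agreement and Cohen's kappa are used here as an assumption, and the two sequences of codes are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of encounters on which the two raters agree."""
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten encounters: T = teacher talk, S = student talk.
observer = ["T", "T", "T", "S", "S", "T", "S", "T", "T", "S"]
teacher = ["T", "T", "S", "S", "S", "T", "S", "T", "T", "T"]

agreement = percent_agreement(observer, teacher)
kappa = cohens_kappa(observer, teacher)
```

Here the raters agree on 8 of 10 encounters, while the chance-corrected index is lower because both raters use the "T" code for most encounters.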

Teacher (self) application of any of the measures for
teacher evaluation has the problem of relative objectivity.
Generally, when the teacher rates himself/herself or
uses a matching instrument, the resulting scores lean in
the direction of the teacher’s self concept and are relatively
high when compared with the scores of an observer. The
teacher also has some problems with reliability in event
counting if contingencies are not established in a pre-set
format prior to the counting process.


The student’s role in evaluation is often evidenced 
through ratings and rating scales. The Educational Feedback 
Center (EFC) at Western Michigan University is a system 
based on student ratings. A profile is produced which 
represents the average student’s reactions to questions 
believed to be related to teacher effectiveness. This proc- 
ess, as well as most processes involving student ratings, 
has not been determined applicable during the early 
elementary school years (EFC usually pertains to children 
in grades 7-12). 

An example of student perceptions of teachers using 
the “Teacher Image Questionnaire” from Western Michigan 
was compiled by William Coats (7), who did a factor 
analysis of 42,810 student responses in which 


a single factor, labeled teacher "charisma," was found to 
account for 61.5% of the variance in test items. Five 

other factors accounted for the balance. It was concluded 
that teacher charisma is probably a factor of teacher effec- 
tiveness, but that student ratings would best be used as only 
one part of a total evaluation . . . 


Criterion Behavior 


When any evaluation is initiated it is necessary to 
define "criterion behavior" or the reason for the need to 
evaluate. By utilizing the information found in Table 1, 
each user must decide on the criterion behavior or expected 
result to be evaluated using the various measures. In every 
case the measure is applied to some observable behavior. 
The inferences made from the observed behavior may lie on 
a continuum between valid and invalid. The most valid 
measure is event counting when the criterion is simply 
a predetermined rate of the observed behavior. Reduced 
validity develops as the observed behavior is separated 
from criterion behavior by inferences and/or theories. 

Earlier mention was made that the functional value 
of evaluation can be evidenced by the number of 
appropriate decisions. The type of decision required will 
determine the purpose and form of the evaluation. Relating 
this to the previous discussion of criterion behavior, the 
evaluation process and the quality of the decision is 
dependent on the appropriateness and the representativeness 
of the criterion chosen (20). 

Bolvin's (3) investigation of teacher performance in the 
area of prescription writing for various students illustrates 
how various criteria were related to prescription writing 
as a measure of effectiveness. Of interest in the Bolvin 
study was the rationale for choosing to evaluate the 
"prescription" as a reflection of teacher effectiveness. 
Prescriptions were chosen as one aspect of teacher activity 
that leaves a record. The evaluation was based on the 
teachers’ criterion for writing a specific prescription and 
the perceived constraints (i.e., time, variety of materials,
etc.), a key point being the critical need to identify and

monitor the tangible evidence of teacher performance. 


LIMITATIONS OF ANALYSIS OF COVARIANCE ON
INTACT GROUP QUASI-EXPERIMENTAL DESIGNS 


PAUL A. GAMES 
The Pennsylvania State University 


ABSTRACT 


Multiple regression models are used to demonstrate that every organismic variable is to some extent a proxy for every pertinent missing
organismic variable. For analysis of covariance, clear assessment of treatment effects is possible only when the treatment vector(s) is (are)
kept orthogonal to all organismic variables by random assignment of subjects. The "adjusted treatment effects" of covariance analysis
on quasi-experimental designs include effects resulting from differences in the adjusted means of the treatment groups on pertinent
organismic variables, both those used as covariates and others that are missing from the analysis. Only if the adjusted treatment means
do not differ in any of the organismic variables that are pertinent for predicting the criterion would the assessment of treatment effects
be proper.


COVARIANCE AS A TECHNIQUE that may be used 
to correct for confounding of organismic variables when 
subjects have not been randomly assigned to treatments 
was presented by McNemar (10:413-414) and Ferguson 
(9:326). Organismic variables are variables that may be 
obtained by measurement of subjects, but that are not 
assignable to subjects. Mental age, sex, ability measures, 
personality measures, past education, etc., are among the 
hundreds of interrelated organismic variables. Organismic 
variables may be contrasted to manipulatable variables, or 
treatments, that may be assigned to any subject. 

After the above texts were in press, Lord (8, 9) and 
Cronbach and Furby (1) questioned the use of covariance 
on groups that initially differ in one or more organismic 
variables. Evans and Anastasio (4) distinguished three
logically different uses of covariance: Use one, where
subjects have been randomly assigned to groups; Use two,
where intact groups are assigned to treatments and
covariance is used to “adjust” for differences between the group
means on observed organismic variables; and Use three,
where the differences in the covariate means are the result
of different treatments (as when final trial learning
measures are used as covariates on retention scores). Evans and
Anastasio argue against the last application, but not the
second application if the covariate is measured before the
treatments are administered and the intact groups are
randomly assigned to treatments. The present article argues
that use two is also unlikely to lead to interpretable results.


Multiple regression (MR) is a general data analysis tech- 
nique that includes all of analysis of variance (ANOVA) 
and analysis of covariance (ANCOVA) as special cases (6). 


For simplicity, we may use vectors of deviation scores,
$x_{ji} = X_{ji} - \bar{X}_j$ as predictors, and $y_i = Y_i - \bar{Y}$ as the criterion,
with means computed over all subjects. Or equivalently,
we take the means as zero, with no loss of generality.
The general model for three predictors is:

\[ y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + e_i \]

or, in vector terms,

\[ y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + e \]

The general test for whether any given variable makes a
nonchance contribution to predictive accuracy (given the
other variables are used) is a test of $H_0\colon \beta_j = 0$ by

\[ F = \frac{R^2_{y.1 \ldots p} - R^2_{y.1 \ldots j-1,\, j+1 \ldots p}}{(1 - R^2_{y.1 \ldots p})/(N - p - 1)} \]

where $p$ = the total number of predictors. Since $SS_y$ = SS
(regression 1...p) + $SS_e$ and SS (regression 1...p) =
$R^2_{y.1 \ldots p}\, SS_y$, the same statistic may be formulated as

\[ F = \frac{SS(\text{add. reg. } j)}{SS_e/(N - p - 1)} \]

where SS (add. reg. j) = SS (reg. 1...p) - SS (reg. 1...j - 1,
j + 1...p). Thus, in a three-variable problem, to test $H_0\colon \beta_2 = 0$,

\[ F = \frac{SS(\text{add. reg. } 2)}{SS_e/(N - 3 - 1)} \]

is compared to $F(\alpha;\, 1,\, N - p - 1)$, where SS (add. reg. 2) =
SS (reg. 123) - SS (reg. 13). This test requires the usual
assumptions of classical MR (7:95); however, $E(b_j) = \beta_j$
even when the normality and homogeneity of variance assumptions
have been violated.
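
The equivalence of the two formulations of F can be checked numerically. The sketch below is a minimal illustration in plain Python with exact rational arithmetic; the data, the helper names (`solve`, `deviations`, `ss_regression`, `added_regression_f`), and the choice of N = 8 are all assumptions introduced for the example and do not come from the article.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def deviations(raw):
    """Convert raw scores to exact deviation scores."""
    values = [Fraction(v) for v in raw]
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def ss_regression(columns, y):
    """Regression sum of squares for deviation-score predictors."""
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns]
           for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    slopes = solve(xtx, xty)
    # For a least-squares fit, SS(regression) = sum_j b_j * (x_j . y).
    return sum(b * c for b, c in zip(slopes, xty))


def added_regression_f(columns, y, j):
    """F for H0: beta_j = 0, computed from SS and again from R squared."""
    n, p = len(y), len(columns)
    ss_y = sum(v * v for v in y)
    ss_full = ss_regression(columns, y)
    ss_reduced = ss_regression(columns[:j] + columns[j + 1:], y)
    df_error = n - p - 1
    f_from_ss = (ss_full - ss_reduced) / ((ss_y - ss_full) / df_error)
    r2_full, r2_reduced = ss_full / ss_y, ss_reduced / ss_y
    f_from_r2 = (r2_full - r2_reduced) / ((1 - r2_full) / df_error)
    return f_from_ss, f_from_r2


# Hypothetical scores for N = 8 subjects on three predictors and a criterion.
x1 = deviations([1, 2, 3, 4, 5, 6, 7, 8])
x2 = deviations([2, 1, 4, 3, 6, 5, 8, 9])
x3 = deviations([1, 0, 0, 1, 1, 0, 1, 0])
y = deviations([3, 2, 6, 5, 9, 7, 12, 10])

# Index 1 selects the second predictor, i.e. the test of beta_2 = 0.
f_from_ss, f_from_r2 = added_regression_f([x1, x2, x3], y, 1)
```

Both routes give the same value because dividing numerator and denominator of the R-squared form by nothing more than $SS_y$ separates the two expressions.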

In MR it is important to distinguish between orthogonal
and nonorthogonal vectors. Two deviation score vectors
$x_1$ and $x_2$ are orthogonal when the covariance between
them, $\sigma_{12}$, is zero. The condition of orthogonality creates
great simplicity in MR. If we have a set of three mutually
orthogonal predictors,

\[ \rho^2_{y.123} = \rho^2_{y1} + \rho^2_{y2} + \rho^2_{y3} \]

and we may determine the independent contribution of
each predictor to $\rho^2_{y.123}$.

When orthogonality is absent, no independent
contribution can be identified with the individual predictors (2).
For the nonorthogonal case, $\beta_3$ is the bivariate regression
slope that exists between y and the residual vector of
$x_3$ after $x_1$ and $x_2$ have been partialled out. That is, $\beta_3$
is the bivariate regression coefficient of a scatterdiagram,
with y on the vertical axis and $e_{3.12}$ on the base axis;
$e_{3.12} = x_3 - \hat{x}_3$, where $\hat{x}_3$ is the predicted $x_3$ from the
multiple regression on $x_1$ and $x_2$. Thus, $\beta_3$ is influenced
not only by the $y x_3$ relationship, but also by both $x_1$ and
$x_2$.

It is well known that for the nonorthogonal case, $\beta_3$
may be drastically changed if either $x_1$ or $x_2$ is dropped
as a predictor. For orthogonal vectors, however,

\[ \beta_{y3.12} = \beta_{y3.1} = \beta_{y3} \]

and the $\beta_3$ value remains the same whether $x_1$ and $x_2$
are included in the equation or not. The stability,
simplicity, and clarity of the orthogonal case are greatly to be
desired, but are rarely achieved with organismic variables
as predictors. Organismic variables are “intrinsically
nonorthogonal” in that any such variable has non-zero
covariances with a large set of other organismic variables. The
methodology of factor analysis has been created in an
effort to make conceptual sense of matrices of such
covariances.
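
The contrast between the orthogonal and nonorthogonal cases can be seen in a small numerical sketch. The contrast-coded vectors, the criterion scores, and the helper names below are hypothetical and chosen so that every quantity is an exact fraction; the nonorthogonal predictor `x3_mixed` is simply the sum of the three orthogonal ones.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def slopes(columns, y):
    """Least-squares slopes for deviation-score predictors."""
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns]
           for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)


def r_squared(columns, y):
    """Proportion of criterion variance accounted for by the predictors."""
    xty = [sum(u * v for u, v in zip(c, y)) for c in columns]
    ss_reg = sum(b * c for b, c in zip(slopes(columns, y), xty))
    return ss_reg / sum(v * v for v in y)


def exact(values):
    return [Fraction(v) for v in values]


# Three mutually orthogonal deviation-score vectors (hypothetical).
x1 = exact([1, 1, 1, 1, -1, -1, -1, -1])
x2 = exact([1, 1, -1, -1, 1, 1, -1, -1])
x3 = exact([1, -1, 1, -1, 1, -1, 1, -1])
y = exact([4, 2, 1, 0, -1, -2, -3, -1])  # criterion, mean zero

# Orthogonal case: the slope of x3 does not depend on x1 and x2.
b3_alone = slopes([x3], y)[0]
b3_full = slopes([x1, x2, x3], y)[2]
r2_sum = r_squared([x1], y) + r_squared([x2], y) + r_squared([x3], y)
r2_full = r_squared([x1, x2, x3], y)

# Nonorthogonal case: a predictor correlated with x1 and x2.
x3_mixed = [a + b + c for a, b, c in zip(x1, x2, x3)]
b3_mixed_alone = slopes([x3_mixed], y)[0]
b3_mixed_full = slopes([x1, x2, x3_mixed], y)[2]
```

With these numbers the orthogonal slope is 1/4 in both equations and the squared correlations add up exactly, whereas the slope of the correlated predictor drops from 11/12 to 1/4 once $x_1$ and $x_2$ enter the equation.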

In contrast, treatment main effects may be conceived
of as “intrinsically orthogonal” to organismic variables in
that any subject may be assigned to any given treatment
level. With proper random assignment of subjects in
experiments, there is no tendency for subjects with high $x_i$
values to be in $A_1$ and subjects with low $x_i$ values in $A_2$.
For simplicity, consider two independent groups of $n$ cases
($N = 2n$) so that the treatment may be represented by a
single vector A, where each subject in the experimental
group is assigned as $+1$ and each subject in the control
group is assigned as $-1$. The A vector has a mean of zero,
and may be treated as a deviation score vector. With
random assignment of subjects to groups, then $\sigma_{Ax_i} = 0$ for
any $x_i$, and A is orthogonal to all possible organismic
variables.


JOURNAL OF EXPERIMENTAL EDUCATION


The ANOVA model for this simple experiment may be
formulated as $y = \beta_A A + e$. Here $\beta_A = .5(\mu_{yE} - \mu_{yC})$, where
E and C represent the experimental and control groups
respectively. $\beta_A$ directly reflects the amount of treatment
effect. The test of $H_0\colon \beta_A = 0$ by $F = SS(\text{reg. } A)/MS_e$ is
an algebraic equivalent of the usual t-test of means of two
independent groups.
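
The stated equivalence between the regression F and the two-group t-test can be verified on a small example. The scores below are hypothetical; exact fractions are used so that the slope, F, and the squared t can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical criterion scores for two groups of n = 5 subjects.
experimental = [Fraction(v) for v in (12, 15, 11, 14, 13)]
control = [Fraction(v) for v in (10, 9, 12, 8, 11)]
n = len(experimental)

# Deviation scores on the criterion and the +1 / -1 treatment vector A.
scores = experimental + control
grand_mean = sum(scores) / (2 * n)
y = [v - grand_mean for v in scores]
a = [1] * n + [-1] * n

# Regression of y on A: slope, regression SS, error SS, and F.
sum_aa = sum(ai * ai for ai in a)
slope_a = sum(ai * yi for ai, yi in zip(a, y)) / sum_aa
ss_reg = slope_a ** 2 * sum_aa
ss_error = sum(yi * yi for yi in y) - ss_reg
f_ratio = ss_reg / (ss_error / (2 * n - 2))

# The usual pooled-variance t statistic, squared.
mean_e = sum(experimental) / n
mean_c = sum(control) / n
pooled_var = (sum((v - mean_e) ** 2 for v in experimental)
              + sum((v - mean_c) ** 2 for v in control)) / (2 * n - 2)
t_squared = (mean_e - mean_c) ** 2 / (pooled_var * Fraction(2, n))
```

The slope equals half the difference between the group means, and F equals the square of t, which is the sense in which the two tests are algebraically equivalent.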

To obtain an ANCOVA model for this situation, we
merely add additional organismic variables as predictors.
In use one, these vectors are orthogonal to the A vector,
hence $\beta_A$ is unaffected by the addition of the new variables,
and is exactly the same as in the ANOVA analysis above.
The model when two variables are used as covariates is:

\[ y = \beta_1 x_1 + \beta_2 x_2 + \beta_A A + e \]

In sample data, chance variations from orthogonality may
occur, and corresponding minor variations in $b_A$ may result;
however, the same parameter is being estimated. The prime
impact of the use of $x_1$ and $x_2$ as covariates is that they
should reduce $MS_e$ if they are indeed effective predictors
of y. Now $SS_e = SS_y - SS(\text{reg. } 12A)$ so that if SS (reg. 12A)
is much larger than SS (reg. A), a substantial increase in
power will occur. The major role of ANCOVA is to increase
power over that from ANOVA. The treatment effect, $\beta_A$,
is not influenced by whether the organismic variables are
used or not in use one.

In contrast to the above situation, consider what
happens in use two when one intact school class is randomly
assigned to the experimental group, and a different intact
class is assigned to the control group. Since there is no
randomization of subjects, it is likely that the two groups
will differ in many organismic variables. The original
difference $\bar{Y}_E - \bar{Y}_C$ is not interpretable as an estimate
of the treatment effect since it is partially confounded with
organismic variable differences between the groups. The
covariance between A and an organismic variable, $x_i$, is

\[ \sigma_{Ax_i} = .5(\mu_{x_iE} - \mu_{x_iC}) \]

When the $x_i$ means differ for the two groups, A is no longer
orthogonal to $x_i$.
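
The covariance expression follows directly from the +1 / -1 coding and can be confirmed on a toy case. The ability scores below are hypothetical values for two intact classes of four pupils each; the covariance is computed with N in the denominator, as for a population value.

```python
from fractions import Fraction

# Hypothetical ability scores for two intact classes of equal size.
x_experimental = [Fraction(v) for v in (105, 110, 120, 115)]
x_control = [Fraction(v) for v in (95, 100, 90, 105)]
n = len(x_experimental)

a = [1] * n + [-1] * n  # treatment vector A, mean zero
x_all = x_experimental + x_control
x_mean = sum(x_all) / (2 * n)

# Covariance between A and x over all 2n subjects.
cov_ax = sum(ai * (xi - x_mean) for ai, xi in zip(a, x_all)) / (2 * n)

# Half the difference between the two class means on x.
half_difference = (sum(x_experimental) / n - sum(x_control) / n) / 2
```

Any difference between the class means on the organismic variable therefore shows up as a non-zero covariance with A, which is exactly the loss of orthogonality described above.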


Evans and Anastasio defend use two partly on the basis
that “the covariate differences among the groups should be
relatively small” (4:228). This is a purely gratuitous
assumption. If a school assigned students of a given grade
randomly into two classes, the statement should be true,
since two different teachers during the present term are
unlikely to produce massive behavioral changes. However,
if a school uses ability grouping, the differences between
the intact groups may be substantial. One suspects use two
of ANCOVA occurs most often when E's have encountered
group differences that are not easy to . . .

If E obtains two organismic variables $x_1$ and $x_2$, and
uses them in ANCOVA with intact groups, he is again
using the model $y = B_1 x_1 + B_2 x_2 + B_A A + e$, but
with the A vector no longer orthogonal to $x_1$ or $x_2$.
(The B’s used above again represent population values; the
reason for different symbols will be apparent later.) Now
$B_A$ will be influenced by whether $x_1$ and $x_2$ are in the
equation. In fact, $B_A = .5(\mu_{y'E} - \mu_{y'C})$, where the $\mu$’s
are the "adjusted means” of y after $x_1$ and $x_2$ have been
"partialled out." It is precisely $2B_A$ that E will attempt
to interpret as the "treatment effect."

The dubious nature of this interpretation is shown by
the following argument. Of the many possible organismic
variables that may be obtained on the subjects, let us
assume that only four organismic variables are needed to
account for individual variation in y. That is, we shall
assume a simple case where all other organismic variables
have a $\beta_i = 0$ once these four variables are included in the
model. In addition, the treatment has an additive effect
so that $\beta_A = .5(\mu_{yE} - \mu_{yC})$ is non-zero. Thus, we assume
the TRUTH condition is that

\[ y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_A A + e \]

Using good insight into the situation, E has
correctly identified $x_1$ and $x_2$ as pertinent variables, has
obtained measures on them, and is using them as covariates.
He has not obtained measures on $x_3$ or $x_4$. Since $x_3$ and $x_4$
are correlated with $x_1$, $x_2$, and A, each of these variables
may be represented as linear functions of the observed
variables

\[ x_3 = c_1 x_1 + c_2 x_2 + c_A A + e_3 \]
\[ x_4 = d_1 x_1 + d_2 x_2 + d_A A + e_4 \]

where $c_A$ is $.5(\mu_{x_3'E} - \mu_{x_3'C})$, the adjusted mean difference
on $x_3$ when $x_1$ and $x_2$ are partialled out, etc. Substituting
these equivalences into the TRUTH equation, what is
obtained is:

\[ y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 (c_1 x_1 + c_2 x_2 + c_A A + e_3) + \beta_4 (d_1 x_1 + d_2 x_2 + d_A A + e_4) + \beta_A A + e \]

Rearranging terms to place all multiplicative constants
of $x_1$ together, etc.,

\[ y = (\beta_1 + \beta_3 c_1 + \beta_4 d_1) x_1 + (\beta_2 + \beta_3 c_2 + \beta_4 d_2) x_2 + (\beta_A + \beta_3 c_A + \beta_4 d_A) A + (\beta_3 e_3 + \beta_4 e_4 + e) \]

It is recalled that the original covariance model using just
the $x_1$ and $x_2$ variables was written

\[ y = B_1 x_1 + B_2 x_2 + B_A A + e \]

It is clear that the B’s of the original model contain all the
$\beta$ terms derived from the TRUTH equation. Thus, the B's
not only reflect the other organismic variables used as
predictors, but they are also influenced by all other
pertinent organismic variables that are NOT used in the
analysis.
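
The algebra of this substitution holds exactly for least-squares fits, so it can be checked on any data set. In the sketch below a five-predictor fit stands in for the TRUTH equation and a three-predictor fit for the covariance model E actually uses; the scores for two intact groups of four subjects, and all helper names, are hypothetical and serve only to illustrate the identity.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def deviations(raw):
    """Convert raw scores to exact deviation scores."""
    values = [Fraction(v) for v in raw]
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def slopes(columns, y):
    """Least-squares slopes for deviation-score predictors."""
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns]
           for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)


# Two intact groups of four subjects; A is correlated with the x's.
a = deviations([1, 1, 1, 1, -1, -1, -1, -1])
x1 = deviations([5, 6, 7, 8, 3, 4, 5, 6])
x2 = deviations([2, 4, 3, 5, 1, 3, 2, 2])
x3 = deviations([7, 5, 8, 9, 4, 6, 3, 5])  # not measured by E
x4 = deviations([1, 3, 2, 5, 2, 1, 0, 3])  # not measured by E
y = deviations([20, 22, 25, 30, 12, 15, 11, 16])

# Full equation (stand-in for TRUTH) and the auxiliary regressions.
beta1, beta2, beta3, beta4, beta_a = slopes([x1, x2, x3, x4, a], y)
c1, c2, c_a = slopes([x1, x2, a], x3)
d1, d2, d_a = slopes([x1, x2, a], x4)

# The covariance model that uses only the measured covariates.
b1, b2, b_a = slopes([x1, x2, a], y)
implied_b_a = beta_a + beta3 * c_a + beta4 * d_a
```

The coefficient E would read as the treatment effect, `b_a`, reproduces `implied_b_a` term for term, so it carries the contributions of the unmeasured $x_3$ and $x_4$ along with $\beta_A$.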


Speaking of the analysis of observational data, Tukey 
(11:118) says, “It is painful to recognize that . . . every 
measured variable serves more or less as a proxy for all 
those that are unmeasured, . . ." The present author 
would prefer to limit the above observation to only organ- 
ismic variables and nonorthogonal treatment variables. 

In this example of covariance, note that $B_A$ by no means
consists only of the treatment effect $\beta_A$. Only if all other
pertinent unmeasured organismic variables have zero
values of $c_A$ (and $d_A$, etc.) is the observed covariance
"treatment effect" $B_A$ equal to the true treatment effect
$\beta_A$. That is, for $B_A$ to be equal to $\beta_A$, we must believe not
only that the analysis accounted for the influence of $x_1$
and $x_2$ on y, but also that $x_1$ and $x_2$ alone simultaneously
remove all possible differences between the two group
means in every other pertinent unmeasured $x_i$ variable.

In reality, of course, almost any dependent behavior is
influenced by dozens of organismic variables. Thus, there
would be dozens of non-zero $\beta_i$'s and dozens of “differences
between the adjusted means of $x_i$ when the covariance
variables have been partialled out” that would have to be
zero before $B_A = \beta_A$. It strains credulity to believe that E,
with the present state of the arts in education and
psychology, can specify and accurately measure two or three
covariates that will reduce the $c_A$, $d_A$ terms of “all pertinent
organismic variables” to zero.

Note the source of this difficulty is that the
quasi-experimental designs of use two fail to keep the treatment
vector orthogonal to all other predictors. In the original
three-vector covariance model of use one, E also had an
incomplete model; $x_3$ and $x_4$ of the TRUTH were ignored.
However, this in no way changed the value of $\beta_A$. Since A
is orthogonal to all four organismic variables, it is
precisely the same in the four-covariate model as in the
two-covariate model as in the ANOVA no-covariate model.
The regression coefficients of $x_1$ and $x_2$ would be influenced
precisely as they are above, so that $x_1$ and $x_2$ would serve
as proxies for the missing $x_3$ and $x_4$, but $\beta_A$ would be
unaffected and the treatment effect correctly assessed.

Note also that the argument has been phrased using
parameter values so that no complications arising from
sampling fluctuation have besmirched the picture. In
addition, it has not been necessary to raise the problem that
the $x_i$’s we use contain measurement error, while the MR
models assumed fixed x's. Nor has it been necessary to
invoke violations of assumptions (3). Operating under ideal
conditions, use two of ANCOVA still produces
confounded results whenever E has ignored any pertinent
organismic variables whose means differ from group to
group after the covariate measures used have been
partialled out. In contrast, in use one, ignoring heterogeneous
regression slopes is comparable to ignoring a treatment by
levels interaction. This would inflate $SS_e$ and produce a
conservative test (compared to that where the true model
were known). However, the test still may be more
powerful than the ANOVA alternative.

Thus, the ANCOVA is a valuable, robust tool for
improving the power of experimental designs where
subjects are randomly assigned to treatments. It is not a
miracle worker that can produce interpretable results
from the quasi-experimental designs of use two.


ABSTRACT


Previous research in locus of control suggested the hypothesis that internal subjects should perform better under mastery than
under traditional assessment procedures, while the reverse should be true of externals. Two experiments were conducted using
undergraduate and graduate subjects. Neither the LC nor the assessment procedure main effects were significant in either study, and
no interaction was found with the undergraduates. With graduate subjects there was a significant interaction opposite in direction to
expectations. Subjects overwhelmingly preferred the mastery procedures. These results are harmful to the construct validity of the
I-E Scale (9) and supportive of the mastery learning approach.


ONE OF THE MOST SALIENT differences between 
mastery learning (1) and traditional educational practice 
is the amount of control exercised by the student over the 
educational process. Under a mastery approach the student 
can usually study at his own pace, decide when he is ready 
to test his mastery of the material, and determine to a large 
extent his own course grade. In contrast, under a traditional 
approach the student must perform more at the instructor’s 
rate and may have less control over his course grade, espe- 
cially if norm-referenced assessment is being used. The 
authors were interested in studying this situational dif- 
ference in the student's control over events important to 
him as it interacted with the personality construct of 
locus of control (LC). LC is conceived as a generalized 
expectancy regarding the control of one’s reinforcements 
(7). A person with an internal LC feels, in general, that 
he himself is in control of the delivery of his own rewards 
and punishments. A person with an external LC believes 
that his reinforcements are regulated by external forces 
such as luck, powerful others, fate, etc.

Seeman and Evans (11) and Seeman (12) found that
internals were more likely than externals to seek out
information relevant to their needs. Lefcourt, Lewis, and
Silverman (5), Rotter and Mulry (8), and Schneider (10) 
all reported finding that internals preferred, or took more 
seriously, situations in which they perceived themselves to 
be in control, and Watson and Baumal (16) found that 
internals made fewer errors in a perceived skill than in a 
perceived chance situation. The reverse findings were true 
of externals in each of these studies. 

In light of the above evidence, the authors hypothe- 
sized (a) that internal Ss would prefer an assessment sys- 
tem based on mastery learning to a traditional assessment 
approach, while the reverse would be true of externals; 
and (b) that internal Ss would perform better in a mas- 
tery learning than in a traditional assessment format, 
while the reverse would be true of externals. Thus, these 
research hypotheses provided a test of an aptitude by 
treatment interaction (2, 3). 


Method 


Two similar experiments were conducted to test the 
interaction hypotheses. Experiment I involved 76 under- 
graduate student teachers enrolled in a required course in 
educational psychology, and Experiment II involved 44
graduate students in a similar graduate level course. Both 
courses were designed and supervised by the second 
author, and both were divided by content into four con- 
secutive segments: classroom applications of reinforce- 
ment principles; the psychology of discipline; the relation- 
ships of beliefs and attitudes to behavior; and measurement 
and mastery learning theory. Examinations for each unit 
were scheduled at fixed times, and all students took the 
same form of the test at that time. For students in the 
traditional format, the score on that test constituted the 
basis for a letter grade on that unit. Students in the mas- 
tery format had to demonstrate competence in the unit, 
defined as achieving a score of 80% or more. If the stu- 
dent did not demonstrate competence, he was apprised 
of his areas of weakness by the instructor or a course 
assistant and helped to learn the material. When the stu- 
dent felt prepared to demonstrate his mastery of the 
material, he was given an alternate form of the same test. 
This process continued until the student achieved mastery. 


Experiment II also included a third assessment condi- 
tion, termed modified mastery, wherein Ss who failed 
initially to attain mastery of the unit were given the 
option of not restudying the material and not taking 
another mastery test. Such Ss could simply accept a C, 
say, rather than learn the material to the specified criter- 
ion. In this condition, then, students had even more con- 
trol over the conduct of the course than in the mastery 
condition. 


In Experiment I, Ss were assigned to take two seg- 
ments under the traditional course format and two under 
the mastery learning format. In Experiment II students 
were assigned to take one of the first three units of instruc- 
tion under the traditional course format, one under the 
mastery learning course format, and one under the modi- 
fied mastery course format. Ss were allowed to choose the 
format they preferred for the last unit. Experiment II 
analyses were based only on the first three units of instruc- 
tion, since the Ss were randomly assigned to conditions 
for those units only. 


All students were pre- and post-tested on an instrument 
which covered all four units of instruction, and which 
included a number of items assessing attitudes toward the 
subject matter and teaching. The I-E Scale (9) was
administered during the pre-test to measure LC. Each student's
standard score on the section of the post-test correspond- 
ing to the unit he took under each assessment condition 
was employed as the dependent variable. LC was a 
between-subjects factor, while assessment condition was
a within-subjects factor. The analyses were performed in
accordance with procedures outlined by Finn (4) and
elaborated by Peng (6) for designs which employ
correlated groups.


Results 


Both the I-E Scale and the post-test instrument showed 
adequate reliability in both experiments (I-E Scale: I = .79, 
II = .81; post-test: I = .59, II = .74). The students, regardless
of LC group, showed an overwhelming verbal preference
for the mastery assessment procedures (I = 68%,
II = 70%) over either the modified mastery (I = 26%,
II = 30%) or the traditional (I & II = 0%) procedures. Since
the Experiment I Ss did not themselves experience the 
modified mastery procedure, it was presented as a hypo- 
thetical alternative. In Experiment II, we had a strong 
behavioral measure of assessment procedure preference, 
since the students were allowed to choose the format they 
preferred for the last unit. Twenty-five (57%) chose the 
mastery procedures, eighteen (41%) chose the modified
mastery conditions, and one (2%) chose the traditional
assessment procedure. It is believed this decisive prefer- 
ence for the mastery approach should carry some weight 
with course planners. 


In Experiment I, the scores on each subtest of the pre- 
and post-tests were standardized, and each student was 
assigned a pre-test mastery score, a pre-test traditional 
score, a post-test mastery score, and a post-test traditional
score by combining his standard scores on the two sub- 
tests of each instrument corresponding to the units of 
instruction taken under mastery or traditional course for- 
mat. The I-E Scale scores were trichotomized so that 
scores of 9 or less indicated internality (N = 25), scores
between 10 and 14 indicated neither internality nor
externality (N = 28), and scores of 15 or more indicated
externality (N = 23).
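
The trichotomization rule just described can be written as a small function. This is only an illustrative sketch: the function name, the group labels, and the sample scores are assumptions introduced here, while the cut points (9 or less, 10 to 14, 15 or more) are those stated in the text.

```python
def classify_locus_of_control(ie_score: int) -> str:
    """Assign an I-E Scale score to an LC group using the Experiment I cut points."""
    if ie_score <= 9:
        return "internal"
    if ie_score <= 14:
        return "moderate"
    return "external"


# Hypothetical scores, for illustration only (not data from the study).
scores = [4, 9, 10, 14, 15, 21]
groups = [classify_locus_of_control(score) for score in scores]
print(groups)
```

Experiment II used a two-way split instead (11 or less versus 12 or more), which would need only one threshold in the same kind of function.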


To test for the interaction of LC and course format, 
the difference was calculated between each S's post-test 
mastery score and his post-test traditional score. A similar 
difference score was also calculated for each S's pre-test 
scores. A one-way analysis of covariance was conducted 
over the three levels of LC on the post-test difference 
between mastery and traditional conditions, using the com- 
parable pre-test difference score as a covariate. The results 
indicated that the pre-test difference (covariate) was not 
significantly related to the post-test difference (F < 1). This 
was expected, since there was no reason to believe Ss’ pre-test
difference scores should be in any way related to post-test
difference scores. The interaction of LC and course
format also yielded F < 1, which did not support the
research hypothesis of this study.


To test for a main effect of mastery versus traditional 
course format, the mean of the post-test difference scores 
for all 76 Ss was tested to see if it was significantly different
from zero. This test yielded F = 3.3; df = 1,72; p = .07.
This nearly significant result may have occurred because
Ss had more experience with the material in the mastery
format units, since they often took a number of tests on
those units. 
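
A test of whether a mean difference score departs from zero can be sketched as follows. This is a simplified illustration, not the authors' analysis: it is a plain one-sample test with df = (1, n - 1), whereas the reported df of 1,72 reflect the covariance model used in the study. The function name and the difference scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_difference_f(differences):
    """Return (F, df) for the test that the mean difference score is zero.

    F is the square of the one-sample t statistic.
    """
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical mastery-minus-traditional post-test differences (illustration only).
diffs = [0.5, -0.2, 1.1, 0.3, -0.4, 0.8, 0.0, 0.6]
f_value, df = mean_difference_f(diffs)
print(round(f_value, 2), df)
```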

To test for a main effect of LC, the post-test mastery 
score was combined with the post-test traditional score, 
and a one-way analysis of covariance was conducted on 
post-test scores, using the comparable sum of pre-test 
scores as a covariate. In this case, the covariate was signifi- 
cantly related to the criterion scores. Pre-test scores 
accounted for 20% of the variance in post-test scores
(r = .45; F = 17.9; df = 1, 72; p < .01). However, LC was
not significantly related to adjusted scores on the post-test 
(F< 1). Tables 1 and 2 summarize these results. 
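
The figures in this paragraph can be checked with simple arithmetic. The sketch below assumes that the values 75.18 and 4.19 appearing in Table 2 are the covariate and error mean squares; that reading is an inference from the table layout, not something the text states.

```python
r = 0.45
variance_explained = r ** 2  # proportion of post-test variance tied to the pre-test

# Assumed reading of Table 2: covariate MS = 75.18, error MS = 4.19.
ms_covariate = 75.18
ms_error = 4.19
f_ratio = ms_covariate / ms_error

print(f"{variance_explained:.0%} of variance, F = {f_ratio:.1f}")
```

Under that assumption the ratio agrees with the reported F of 17.9 to within rounding.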

In Experiment II, the scores on the first three subtests 
were again standardized within subtests and across all Ss, 
and each S received a traditional, a modified mastery, and 
a mastery score corresponding to his standard scores for 
the appropriate instructional units. This was done for both
pre- and post-tests. Since there were fewer Ss in this experiment,
they were simply dichotomized on the I-E Scale
(rather than trichotomized as before) into internals with 
I-E scores of 11 or less (N = 23), and externals with scores 
of 12 or more (N = 21). 

Table 1.—Experiment I Cell Means and Standard Deviations

                                  Traditional
Source                          Pre        Post

Internals (N = 25)     X        .10        -.48
                       SD      1.81        1.31
Moderates (N = 28)     X        .03         .01
                       SD      1.43        1.55
Externals (N = 23)     X        .34         .04
                       SD      1.46        1.54


Table 2.—Experiment I Analyses of Covariance

Source*               Effect tested

Constant term         Mastery vs. traditional main effect
Between groups        Treatment X LC interaction
Covariate
Error

*Dependent variable is the difference between mastery and traditional scores on the
post-test. Covariate is the difference between mastery and traditional scores on the
pre-test.

Source**              df      MS        F       p <     Effect tested

Between groups                                          Locus of control main effect
Covariate (r = .45)          75.18     17.93    .001
Error                 72      4.19

**Dependent variable is the sum of the mastery and traditional scores on the post-test.
Covariate is the sum of mastery and traditional scores on the pre-test.


The logic of the analysis was exactly the same for this
experiment as for the earlier one. However, since in this
experiment there were two degrees of freedom for the
course format factor, two difference scores (mastery versus
traditional, and modified mastery versus traditional) were
used simultaneously as a multivariate set of dependent
variables in order to test for a LC X format interaction and
to test for a course format main effect. The test for a LC
main effect was again a univariate test employing the sum
of scores under all experimental conditions as the dependent
variable. Tables 3 and 4 summarize these results.
Again, the pre-test differences between scores for those
units taken under mastery conditions and scores for units
taken under traditional conditions were found to be unrelated
to the same post-test differences. Hence, it was unnecessary
to employ, as the authors did, such pre-test difference
scores as covariates. The multivariate test of the


LC X course format interaction was marginally significant
(F = 3.0; df = 2,39; p < .06), and the univariate tests on
each of the post-test difference scores, using the appropriate
pre-test difference scores as covariates, were clearly
significant (Fs = 5.2 and 6.1; df = 1, 41; ps < .04 and .02).
However, as shown in Figure 1 (the means graphed in Figure 1
are the uncorrected means, since the covariates were
not effective in the test of the interaction), this significant
interaction was opposite in direction to the hypothesis! The
stronger the external control of the course, the better the
internals did. Contrarily, the greater the opportunity for
self-direction, the better the externals performed. Nothing
in the theory of LC would suggest that this should be the
case. In Experiment I no interaction was found, and in
Experiment II an interaction opposite in direction to the
hypothesis was found. These contradictory results suggest
a need for replication, but both experiments agreed in
failing to confirm the hypothesis. In neither experiment
were there any significant differences between LC groups
with respect to preference for assessment procedures.
Overwhelming preference for the mastery approach was the
rule regardless of I-E Scale score.

The cognitive impact of the course was demonstrated
by highly significant (p < .01) changes in performance
(2.5 to 3.0 pre-test standard deviations) from pre- to post-test.
The affective impact of the course was apparent in
a significant (p < .01) positive shift, by sign test (13), in the
attitudes of these Ss toward the concepts and principles
of the course and their applications to teaching.


Table 3.—Experiment II Cell Means and Standard Deviations

                           Modified mastery        Traditional
                           Pre       Post          Pre       Post

Internals (N = 23)    X
                      SD
Externals (N = 21)    X


Source*                        df       F       p <     Effect tested

Constant term (M-T, MM-T)      2,39     .18     .84     Treatment main effect
Between groups (M-T, MM-T)     2,39    3.04     .06     Treatment X LC interaction
Covars                         4,78    1.20     .32
   M-T     Mult. r = .32
   MM-T    Mult. r = .21
Error (M-T)

Univariate
Source**                       df       MS      F       p <

Between groups
Covariate (r = .32)
Error

**Dependent variable is the sum of post-test scores under all three conditions.
Covariate is the sum of pre-test scores under all three conditions.

Thus, while the course had powerful cognitive and 
affective effects, neither LC nor assessment condition had 
a significant effect, and the hypothesized interaction failed 
to appear.

Discussion 


Several explanations may be advanced to account for the 
data. Originally, those who promulgated the LC construct 
hypothesized that it would be strongly related to n-achieve- 
ment (7), which, one would expect, would lead to school 
achievement. Perhaps, however, LC simply is not a powerful 
variable in school situations. Rotter (9) and Warchime (15) 
have suggested as much in efforts to account for the fact 
that the I-E Scale seems to be unrelated to school grade
point average. The hypothesis of a relationship between LC
and n-achievement has also fared poorly. Wolk and DuCette
(17) found no significant correlation between the
I-E Scale and two measures of n-achievement in two samples
of Ss. 

Another possible explanation of the findings is that the 
I-E Scale assesses socio-political attitudes rather than an
underlying personality dimension with motivational consequences.
The responses to the I-E Scale which indicate
an internal LC usually emphasize individualism and success
through hard work. Such responses should be congenial
to those of conservative socio-political philosophy. On the 
other hand, external responses often emphasize collectivism 
and common oppression by greater powers. These responses
probably fit well in the world-view of many liberal
thinkers. Indeed, Thomas (14) found that although his 
sample of 30 liberals was more politically active than his 
sample of 30 conservatives, the liberals were significantly 
more external than the conservatives. 

If the I-E Scale measures socio-political philosophy, the
interaction found in the second experiment is readily ex- 
plained. If the externals are liberals, they should prefer the 
more liberal course formats, while the conservative internals 
should prefer the traditional instructional methods. This is 
exactly what was found in Experiment II. 

However interpreted, the results of these experiments are 
damaging to the construct validity of the I-E Scale. Further 
experimentation should be undertaken to resolve the dis- 
crepancies between the results of the two studies, but there 
is no evidence in either experiment of the interaction pre- 
dicted by LC theory. 

The finding of most importance for education was that 
both undergraduate and graduate students showed an over- 
whelming preference for the mastery learning format. 
Since the students learned the material equally well under 
all of the assessment procedures, the authors believe this 
result argues strongly in favor of the mastery learning
approach.

Figure 1.—Post-test performances of internals and externals in the
different assessment conditions

NOTE


1. Appreciation is expressed to Dr. Howard Kight and Mr. 
Bruce Kestleman for their suggestions; to the students in these 
courses for their willingness to be part of an experiment; and to 
the following course assistants for helping to make these courses 
successful: John Dilendik, Marilyn Dozoretz, Murial Frank, Kath- 
leen Van Every, Marvin Lew, Alfred Sarnowski, Joseph Zampogna, 
and Margaret Zabranskey. An earlier version of this paper was
presented on April 2, 1975, at the American Educational Research
Association Annual Meeting in Washington, D.C.

ATTITUDE-APTITUDE RELATIONSHIPS IN
THE QUANTITATIVE DOMAIN: 


A CANONICAL ANALYSIS 


ANDREW G. BEAN 
CATHLEEN KUBINIEC MAYERBERG 
Temple University 


The purpose of this study was to investigate the relationsh[...]
verbal and quantitative aptitude. Subjects were 353 graduate s[...]
[...]icant canonical variates were obtained. The first showed a moderate
relationship between positive attitudes toward quantitative concepts
and high quantitative aptitude. The second indicated a slight
relationship between negative attitudes toward quantitative concepts
and high verbal aptitude.


MOST MEASURES OF “attitude toward mathematics”
consist of a single scale based on items which sample a variety
of attitudes toward different aspects of mathematics
rather than focusing on some specific part of the subject.
Such generalized instruments may fail to measure important
facets of the variable of interest (2). In support of this
idea, Mayerberg and Bean (5) reported data indicating that
attitude toward different mathematics-related concepts
(e.g., Algebra, Statistics, Calculations, Formulas, etc.) could
be considered as multidimensional. They suggested use of
the more general term “attitude toward quantitative
concepts” instead of “attitude toward mathematics” to describe
this domain.

Since aptitude is related to achievement, and since achievement
affects attitude and vice versa, one would expect attitude
to be related to aptitude. Aiken (2) cites several
studies which report low to moderate correlations between
attitudes toward mathematics and measures of scholastic
aptitude.

In all such studies, however, attitude was measured by a
single scale. Needed, then, is a study which examines the
relationship between the various aspects of attitude toward
quantitative concepts and scholastic aptitude. The purpose
of this study was to investigate the relationships among nine
relatively independent factors of attitude toward quantitative
concepts and measures of quantitative and verbal aptitudes.

Studies investigating attitude-aptitude relationships in the
quantitative domain support the generalization that attitude
toward mathematics is more closely related to quantitative
aptitude than to verbal aptitude. For example, in a sample
40 college students, Dreger and Aiken (3) found that anxiety
toward mathematics had a correlation of -.25 with the
American Council on Education (ACE) quantitative score
and -.08 with ACE linguistic score. Aiken (1) also found
a correlation of .37 between Mathematics Attitude Scale
(MAS) score and Scholastic Aptitude Test (SAT) quantitative
score, but no significant correlation with SAT
verbal score.

The relationship between mathematics self-concept and
various aptitudinal variables has also been studied. Using a
sample of seventh-grade students, Holly et al. (4) reported
a significant correlation of .47 between scores on the
Mathematics Self-Concept Scale (MSCS) and the Comprehensive
Test of Basic Skills Pretest in Mathematics. Correlations of
similar magnitude were obtained between the MSCS and
other measures of mathematics aptitude, compared with a
correlation of .32 with the Lorge-Thorndike Intelligence
Test Verbal I.Q. Thus, from two related perspectives
(attitude toward the subject matter and perception of one's
ability in the subject matter), there is evidence of a
relationship between aptitude in and attitude toward mathematics.
From the above studies, one would expect measures of 
attitude toward quantitative concepts to be moderately re- 
lated to quantitative aptitude; furthermore, they should be 
more highly related to quantitative aptitude than to verbal 
aptitude. Thus, it is hypothesized that a canonical correla- 
tion analysis relating several relatively independent measures 
of attitude toward quantitative concepts to measures of ver- 
bal and quantitative aptitude will result in a single significant 
canonical variate. All of the attitude measures as well as the 
quantitative aptitude measure will be highly correlated with 
this first variate. That is, it is predicted that only one variate
is necessary to explain the inter-relationships between the
attitude and aptitude domains. The nature of this variate
will be related to quantitative aptitude, but not to verbal
aptitude.


Method 

Subjects consisted of students enrolled in one of three 
graduate-level courses in educational research and statistics 
offered within the College of Education of a large urban 
university. To obtain an adequate sample size, data were 
gathered from all students enrolled in these courses for four
different semesters. The total sample size was 353 (175 
males and 178 females). 
Variables

The nine attitude measures used were obtained by having 
subjects respond to a semantic differential measuring instru- 
ment containing six quantitative concepts and fourteen bi- 
polar adjective scales. Three of the concepts (Algebra, Sta- 
tistics, and Mathematics) reflected substantive areas in the 
quantitative domain; the remaining three concepts (Numbers, 
Calculations, and Formulas) reflected tools employed in 
that quantitative domain. 

The six concepts were rated on the following fourteen
bipolar adjective scales: (1) enjoyable-unenjoyable;
(2) attractive-repellent; (3) interesting-boring;
(4) pleasant-unpleasant; (5) valuable-worthless; (6) useful-useless;
(7) important-unimportant; (8) simple-complex; (9) easy-difficult;
(10) lucid-obscure; (11) clear-hazy; (12) meaningful-meaningless;
(13) intelligible-unintelligible; and (14) good-bad.
These adjectives were selected on the basis of two criteria:
(1) a meaningful relationship to the concepts rated, and
(2) a presumably heavy loading on the evaluative meaning
dimension.

From a principal factor analysis with oblique rotation
for simple loadings, nine factors were obtained and labeled
as follows: (1) Algebra; (2) Statistics; (3) Calculations;
(4) Formulas; (5) Numbers and Mathematics; (6) Useful;
(7) Easy; (8) Clear; and (9) Good. The first five factors reflect
attitude toward a specific quantitative concept. For
example, a person with a high score on the first factor could 
be described as having a positive attitude toward Algebra, 
describing it as enjoyable, interesting, etc. The remaining 
four factors represent a generalized attitude toward the en- 
tire domain of quantitative concepts. For example, a person 
with a high score on the sixth factor describes all of the 
quantitative concepts as "Useful." 

Correlations among the nine factors were generally low 
and positive. A thorough description of these factors, along 
with evidence of their reliability and construct validity, is 
given in Mayerberg and Bean (5). 

Factor scores for each of the nine factors described above 
served as the attitude measures used in the study. The meas- 
ures of scholastic aptitude consisted of Graduate Record 
Examination Verbal (GREV) and Quantitative (GREQ) 


scores. 


Data Analysis 


Canonical correlation analysis was used to relate the pre- 
dictor set of attitude measures to the criterion set of apti- 
tude measures. To aid in the interpretation of the canoni- 
cal variates, the variable-variate correlation matrix was com- 
puted, providing canonical component loadings (6). 
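
The analysis named here can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the authors' program: the function names are hypothetical, the data are simulated (nine "attitude" columns and two "aptitude" columns for 353 cases), and the computation uses the standard approach of whitening each correlation block and taking the singular values of the cross-correlation matrix. The loadings returned are the variable-variate correlations described in the text.

```python
import numpy as np


def inverse_sqrt(sym_matrix):
    """Inverse square root of a symmetric positive-definite matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(sym_matrix)
    return eigenvectors @ np.diag(eigenvalues ** -0.5) @ eigenvectors.T


def canonical_correlation(x, y):
    """Return canonical correlations and variable-variate loadings for x and y."""
    p = x.shape[1]
    r = np.corrcoef(np.hstack([x, y]), rowvar=False)
    rxx, ryy, rxy = r[:p, :p], r[p:, p:], r[:p, p:]
    kx, ky = inverse_sqrt(rxx), inverse_sqrt(ryy)
    # Singular values of the whitened cross-correlations are the canonical rs.
    u, s, vt = np.linalg.svd(kx @ rxy @ ky, full_matrices=False)
    x_weights = kx @ u        # weights for standardized predictors
    y_weights = ky @ vt.T     # weights for standardized criteria
    # Loadings: correlations of each variable with its own set's variates.
    return s, rxx @ x_weights, ryy @ y_weights


# Simulated data only; these are not the study's scores.
rng = np.random.default_rng(0)
n = 353
attitudes = rng.standard_normal((n, 9))
verbal = rng.standard_normal(n)
quantitative = 0.4 * attitudes[:, 0] + 0.3 * attitudes[:, 6] + rng.standard_normal(n)
aptitudes = np.column_stack([verbal, quantitative])

canonical_rs, attitude_loadings, aptitude_loadings = canonical_correlation(
    attitudes, aptitudes
)
print(np.round(canonical_rs, 2))
```

With two criterion variables there are at most two canonical variates, which is why the study reports two canonical correlations.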


Results 


Means, standard deviations, and correlations among the 
predictors and the criteria were computed separately for 
each sex. Since the correlations for males and females were
similar, all of the data presented here are based on the sexes 
combined. 

The attitude factors were in z-score form; thus, all atti- 
tude measures had a mean of zero and a standard deviation 
of one. GREV and GREQ means were 547 and 515, respectively;
the standard deviations were 104 and 102, respectively.

Table 1 presents the correlations among the predictors
and the criteria. As stated previously, correlations among
the attitude measures were generally low and positive. All
correlations of the attitude measures with GREQ were positive
and statistically significant at the .05 level. Three factors,
“Algebra,” “Easy,” and “Clear,” showed correlations
of from .35 to .39. Thus, positive attitudes toward quantitative
concepts were associated with high quantitative aptitude.

The magnitude of correlations between the attitude
measures and GREV was quite low, with the only statistically
significant correlations being negative. Four factors,
“Numbers and Mathematics,” “Calculations,” “Easy,” and
“Good,” showed marginally significant correlations ranging
from -.17 to -.11.



Table 1.—Intercorrelations among Attitude and Aptitude Measures

                         2     3     4     5     6     7     8     9    10    11

Attitude Measures
 1. Algebra             .32   .21   .43  -.30                           .00   .39
 2. Statistics                .26   .37  -.08   .20   .26   .28   .27  -.05   .23
 3. Calculations                    .32   .17   .17   .30   .24   .24  -.14   .16
 4. Formulas                                    .25   .38   .34   .30  -.01   .27
 5. Numbers & Math                                    .22   .26   .23  -.17   .14
 6. Useful                                           -.12   .25   .45   .07   .20
 7. Easy                                                    .40   .12  -.11   .35
 8. Clear                                                         .33   .07   .37
 9. Good                                                               -.12   .15
Aptitude Measures
10. GREV                                                                      .30
11. GREQ

Note: For N = 353, a correlation of .1 or greater is statistically significant
at the .05 level, using a two-tailed test.


Canonical correlation analysis of the relationship between
the attitude factors and the aptitude measures yielded two
significant canonical variates. The canonical correlations of
.55 and .30 both were significant at the .01 level.

The two canonical variates may be interpreted by examining
the canonical component loadings shown in Table 2.
All attitude measures have positive correlations exceeding
.30 with the first variate. GREQ loads .93 on the first variate,
while GREV shows a near-zero relationship.

Thus, the first variate is consistent with the expectation
that positive attitudes toward quantitative concepts are
associated with high quantitative aptitude. Attitude measures
contributing most to the first variate (specifically those with
loadings above .50) are positive attitudes toward “Algebra”
and “Formulas” and the attitudes that quantitative concepts
are “Easy” and “Clear.” The first canonical correlation of
.55 indicates a shared variance between the predictor and
criterion set of approximately 30 percent.

Contrary to the original hypothesis, a second statistically
significant vector was found. Three attitude factors show
negative loadings with absolute values exceeding .30. “Numbers
and Mathematics” shows the closest relationship to the
second variate (-.53). Thus, negative attitude toward
“Numbers and Mathematics” is associated with higher
GREV scores. The second canonical correlation accounts
for approximately 9 percent of the shared variance between
the predictor and criterion set.


Table 2.—Canonical Component Loadings for Two Canonical Variates

                           First variate    Second variate

Predictor Variables
   Algebra                      .73              .11
   Statistics                   .47             -.11
   Calculations                 .39             -.21
   Formulas                     .53
   Numbers & Math                               -.53
   Useful
   Easy
   Clear
   Good
Criterion Variables
   GREV
   GREQ                         .93

Canonical R                     .55**            .30**

**p < .01
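
The shared-variance percentages quoted in the Results follow from squaring each canonical correlation. The short check below uses only the two canonical correlations reported in the text (.55 and .30); the variable names are illustrative.

```python
canonical_rs = [0.55, 0.30]

# Squared canonical correlation = proportion of variance shared by the paired variates.
shared_variance_percent = [round(r ** 2 * 100) for r in canonical_rs]
print(shared_variance_percent)
```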


Discussion

Canonical analysis relating nine measures of attitude toward
quantitative concepts to verbal and quantitative aptitude
resulted in two significant canonical variates. In agreement
with the original hypothesis, the first canonical variate
related positive attitudes toward quantitative concepts
primarily to high quantitative aptitude. It is reasonable to
expect that high loadings on this first variate would occur for
“Algebra” and “Formulas,” since these skills are important
in obtaining a high GREQ score. Similarly, persons expressing
the attitude that quantitative concepts are “Easy” and
“Clear” are likely to say so because of previous successes in
quantitative performance.

The judgment that quantitative concepts are “Useful” is
not strongly related to GREQ. Such a result is not surprising
since attitudes about the usefulness of quantitative concepts
are not strongly related to the perceived ease and clarity of
these same concepts (see Table 1). The relatively low
correlation between attitude toward “Numbers and Mathematics”
and GREQ is somewhat unexpected. More will be said about
this below.

The second canonical variate was essentially a negative
relationship between “Numbers and Mathematics” and verbal
aptitude. A theoretical explanation for this finding is not
easily constructed. One possible empirical explanation for
this finding could be that the “Numbers and Mathematics”
factor lacks construct validity. Results of the factor analysis
used to create the nine attitude factor scores indicated that
this factor was the least “clean” in terms of factor structure.
It could be that the concept “Numbers and Mathematics”
embedded in an instrument containing more specific concepts
such as “Algebra” and “Statistics” is too broad and
ambiguous to provide useful attitude measures.

In future studies, these attitude measures can be examined
as predictors of achievement in courses dealing with quantitative
topics. The moderate degree of relationship between attitude
and aptitude allows for the possibility that attitude
can be used to increase predictive validity [...]
using aptitude measures alone. If the relationship between
attitude and aptitude [...]
information would be [...]
a predictor set already c[...]
aptitude.


In summary, positive attitudes toward quantitative concepts
were found to be moderately related to quantitative
aptitude. Positive attitudes toward “Algebra” and “Formulas”
and the attitudes that quantitative concepts are “Easy” and
“Clear” are the four measures most closely related to high
quantitative aptitude. Other attitude measures, notably a
positive attitude toward “Numbers and Mathematics,” have
a small but statistically significant relationship to lower
verbal aptitude.

PERFORMANCE OF READABILITY FORMULAS
UNDER CONDITIONS OF RESTRICTED ABILITY LEVEL
AND RESTRICTED DIFFICULTY OF MATERIALS


NELSON RODRIGUEZ T.
Caracas, Venezuela


LEE H. HANSEN
Madison, Wisconsin Public Schools


ABSTRACT


[...] Formulas [...] subjects of varying ability and [...]
[...]ability formula could be obtained [...]
[...]rmuth, the authors constructed [...]
data gathered on textbook, newspaper, and leisure reading passages
administered to seventh graders. Generally, it was found that
narrow-band formulas designed for material of restricted difficulty
and subjects with a narrower range [...]eral formulas,


READABILITY FORMULAS are predictive devices that
provide quantitative estimates of the relative difficulty of
pieces of writing. The general purpose for their use is to
estimate the probable success a reader will have in understanding
a set of materials without requiring the reader to
take tests of any kind (4).

The underlying assumption of readability formulas is
that the difficulty of a piece of writing is determined, at
least partially, by elements contained in the writing itself:
content, style, print, etc. Most readability formulas use
style elements for their predictions. The general procedure
in the development of readability formulas is the following:

1. A set of plausible readability variables is generated
on the basis of previous experience or of linguistic 
research. 

2. Those variables are computed for a set of passages. 

3. For a sample of individuals from the target population, 
scores are obtained in a criterion of comprehension, 
usually a test based on the passages. 

4. Finally, the formula is computed by regressing the 
readability variables on the criterion score. 

Once the formula is obtained, it can be used with subjects
of similar characteristics as those in the normalization
sample to predict their criterion scores, and, indirectly,
their comprehension of the materials.
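
The four steps above can be sketched as an ordinary least-squares fit. Everything specific in this example is an assumption made for illustration: the two style variables (mean word length and mean sentence length), the six passages, and their mean cloze scores are invented, and they are not Bormuth's variables or the study's data. The fit takes the criterion score as the dependent variable and the readability variables as predictors.

```python
import numpy as np

# Steps 1-2: hypothetical style variables computed for six passages.
# Columns: mean word length (letters), mean sentence length (words).
features = np.array([
    [4.1, 12.0],
    [4.6, 18.5],
    [5.0, 22.0],
    [4.3, 15.0],
    [5.4, 27.5],
    [4.8, 20.0],
])

# Step 3: hypothetical criterion, the mean proportion of cloze items answered correctly.
cloze_mean = np.array([0.58, 0.44, 0.36, 0.52, 0.25, 0.40])

# Step 4: least-squares fit of the criterion on the variables, with an intercept.
design = np.column_stack([np.ones(len(features)), features])
coef, *_ = np.linalg.lstsq(design, cloze_mean, rcond=None)


def predict_cloze(word_length, sentence_length):
    """Predicted mean cloze score for a new passage under the fitted formula."""
    return float(coef @ np.array([1.0, word_length, sentence_length]))


print(np.round(coef, 3))
print(round(predict_cloze(4.5, 17.0), 2))
```

A formula fitted this way applies only to readers and materials resembling the sample it was built on, which is the point the following paragraphs take up.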

Klare (4) presented a developmental history of readabil- 
ity formulas up to the 1960s. Although some improve- 
ments in computation and cross-validation procedures had 
previously been reported, a real breakthrough occurred 
with the development of the cloze procedure, recent 
advances in linguistic research, and the development of 
more powerful computer soft- and hardware (1). The 
cloze procedure allowed for a more objective external 
criterion and more powerful psychometric techniques; the 
advances in linguistic research permitted the development 
of variables that were unknown before; the advances in 
computer technology allowed for more complex data analy- 
sis at a reasonable cost. 

Two documents by Bormuth (1, 2) report attempts to 
incorporate both cloze tests and new linguistic variables 
into readability research. He reports formulas that reach 
an unprecedented degree of accuracy, not only in the standardization
samples, but also in cross-validation.

The present study was undertaken under the following 
rationale. In his normative and cross-validation studies, 
Bormuth used samples of materials of wide-ranging difficulty
levels; his samples of students were also from a wide
ability range. Under those circumstances, he determined
what could be considered a ceiling in the prediction ability
of his formulas. It would be interesting to determine a
“floor” by restricting the range of the materials as well as
the ability of the students. This situation is provided by
testing students from one single grade level on materials
from one single subject area. Although within one grade
level there is usually a wide range of ability, that range will
obviously be more restricted than across two or more grade
levels. The former is the situation usually confronting the
individual teacher when she has to decide what materials
to assign for a specific group of students in her class. The
teacher would presumably randomly select passages from
the books, compute the variables required by the formula,
and obtain an estimate of their difficulty. This, together
with her knowledge of the students’ reading ability (from
tests, inventories, and direct observation), would allow her
to match students with materials. This is also the situation
that school systems confront when deciding what materials
to purchase for specific grade levels.

In this context, that is, given a restricted level of ability 
and a restricted range of difficulty of the materials, the 
following questions are of interest: 


1. How well would Bormuth’s formulas predict mean 
(cloze) difficulty? 

2. If a set of variables originally tested by Bormuth were 
used under these circumstances, would the same 
variables enter a multiple regression equation? 

3. How well would the set of variables entering the equation 
predict the (cloze) mean? 

4. As contrasted with Bormuth's formulas that were developed 
under other circumstances, how well would a 
formula developed in a sample of restricted ability and 
difficulty ranges predict (cloze) difficulty in other sets 
of materials? 

In this project two studies were performed. The first one 
attempts to answer the first three questions; that is, (1) to 
determine how well Bormuth’s simple formulas perform in 
this case; (2) to suggest which variables are the best single 
set of predictors; and (3) to provide a multiple regression 
equation. The second study cross-validates Bormuth's formulas 
and the formula obtained in the first study in two 
other sets of materials for purposes of comparison. 


Procedure 


The data used in this project originated from an assess- 
ment of reading literacy performed in the Public School 
System of Madison, Wisconsin. In that project a large num- 
ber of fourth, seventh, tenth, and twelfth graders were tested 
using 10-item cloze tests developed on 60- to 70-word pas- 
sages randomly selected from a predefined universe of mate- 
rials that the students were supposed to be able to read. That 
universe included several domains, among others—safety 
materials, occupational information, textbooks, leisure- 
time reading, consumer materials, etc. (3). For each passage, 
the mean cloze score and some additional information were 
available. 

For the present project, the seventh grade data were 
selected, beginning with textbook, leisure-time materials, 
and newspapers. There were respectively 45, 36, and 23 
passages available in those domains. For each passage a set 
of linguistic variables was computed. They were used as 
independent variables to predict mean cloze scores. 

All the variables used in this project were developed and 
used by Bormuth in his 1966 and 1969 readability studies. 
They were included in this project for one of the following 
three reasons: 

1. They had entered the stepwise multiple regression equa- 
tion in Bormuth’s studies. 

2. They did not enter the equations, but showed a high 
correlation with the criterion. 

3. They were included in the manual computation formulas 
developed by Bormuth. 

The second set of variables was included because sampling 
errors sometimes prevent a variable from entering an 

Table 1.—Variables Selected for the Present Study from Bormuth's 
1966 and 1969 Studies and His Readability Formulas 


Variable 1966 1969 Formula 


Letter/independent clause x 
Nouns/structural words x 
Letter/words x 
Pronouns/conjunctions x 
Syllables/sentences 

Syllables/words 

Personal pronouns/words 

Dale-Chall List 769/words 

Dale-Chall List 3000/words 
Words/sentences 

Letters/words 


WoW OX KKK xw x 
x eK KR RK KH 


Letters/meaningful Punctuation unit 
Syllables/independent clause x* 
Words/independent clause x* 
Structural words/nouns x* 
Adjectives/structural words x* 
Anaphora/words x* 
Letters/syllables x 


Structural words/adjectives x* 


* Denotes "promising variables." See text for explanation. 


equation. Under the changed circumstances of this project, 
it was considered that other variables would enter the 
equations. Table 1 lists the variables and the study in which 
they originated. Marked with an asterisk are "promising 
variables," that is, those considered in item number 2 above. 


Study I 


In this study 45 passages from textbook materials were 
used. Three statistical analyses were performed: 


1. By using Bormuth's formulas, four predicted means were 
computed for each passage. A zero-order correlation 
was then computed between predicted and obtained 
mean cloze scores. 

2. A stepwise multiple regression equation was computed 
with nineteen independent variables (Table 1) and one 
dependent variable (mean cloze). The program was 
allowed to run forward and unrestricted. 

3. The same program (as in 2) was run but restricted at the 
levels of significance of .05 for inclusion and .25 for 
exclusion. 

Results and Discussion 


In considering the results of this study, it must be understood 
that the present project differs from Bormuth's studies 
in at least two main points. First, this study uses a more 
restricted range of passage difficulty as well as ability level 
of the subjects; second, the average passages used in this 
study are approximately one-fourth of the length of the 
ones used by Bormuth (70 words in this study as compared 
to 287 words in Bormuth's 1966 study and 110 in the 1969 
study). Furthermore, Bormuth used five cloze forms for 
each passage, including, by so doing, each possible word as 
a cloze item; this project used only a random sample of 
20% of the words. The first one is a "built-in" difference; 
the second results from using available data. 


How well do Bormuth's formulas perform in this situation? 

Table 2 shows the zero order correlation coefficients 
between predicted and obtained mean cloze scores. 

The results show what can be considered a floor value for 
Bormuth's formulas. Formula 3 seems to be performing 
better than the others, a result that was confirmed in Study 
II. All correlations reached significance at the .01 level 
except Formula 1. When comparing these results with 
Bormuth's cross-validation studies, a drop in the correlations 
is observed from around .90 to around .40. This decrease in 
validity can be due either to the restricted range of difficulty 
and ability or to the shorter length of the passages. An 
inspection of the raw data shows that at least one of the 
variables included in the formulas, personal pronouns per 
word, was absent in many of the passages. This suggests the 
likelihood of a lower reliability of the linguistic variables 
included. Furthermore, Bormuth found a certain degree of 
curvilinearity in the scatter plots of expected versus 
obtained difficulties in his cross-validation studies. It is 
possible that for shorter passages the relationship between 
linguistic variables and cloze scores is curvilinear; this 
would also contribute to reducing the correlation 
coefficients. This possibility is supported by Bormuth's 
finding (1) that at the level of independent clause many 
variables show curvilinearity, whereas at passage level, 
curvilinearity tended to disappear; this suggests that with 
increased length in the passages, the relationships are linear. 
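
The effect of a restricted range on a validity coefficient can be illustrated with a small simulation. The sketch below is not part of the original analysis: it assumes normally distributed predicted and obtained scores with a full-range correlation of .90, and it treats "restriction" as keeping only cases whose predicted score lies within half a standard deviation of the mean. The function names, the cutoff, and the sample size are all hypothetical choices made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Zero-order (Pearson) correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_range_restriction(n=20000, true_r=0.90, cutoff=0.5, seed=1):
    """Return (r over the full range, r over the restricted range).

    Assumption: scores are standard normal and the restricted sample keeps
    only cases with |predicted| < cutoff (in standard deviation units).
    """
    rng = random.Random(seed)
    noise_sd = math.sqrt(1.0 - true_r ** 2)
    predicted = [rng.gauss(0.0, 1.0) for _ in range(n)]
    obtained = [true_r * p + noise_sd * rng.gauss(0.0, 1.0) for p in predicted]
    r_full = pearson_r(predicted, obtained)
    kept = [(p, o) for p, o in zip(predicted, obtained) if abs(p) < cutoff]
    r_restricted = pearson_r([p for p, _ in kept], [o for _, o in kept])
    return r_full, r_restricted


if __name__ == "__main__":
    r_full, r_restricted = simulate_range_restriction()
    print(f"full range r = {r_full:.2f}, restricted range r = {r_restricted:.2f}")
```

Under these assumptions the restricted-range coefficient falls well below the full-range one even though the underlying relationship is unchanged, which is the kind of drop the restricted-range explanation would predict. The simulation says nothing about the competing explanation of passage length.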

As a continuation for this study, it is suggested that a 
study be performed using longer passages (250+ words) but 
restricting, as in this project, ability range and difficulty. 
If it is demonstrated that part of the reduction in 
predictability can be attributed to passage length, it would be 
necessary to use longer passages in the evaluation of materials. 


Would the same variables enter a stepwise multiple regression 
equation? 

The variables that entered the equations in past 
studies are listed in Table 1. Table 3 summarizes the steps in 
the unrestricted multiple regression program run in this 
project. 

The stepwise regression procedure selects from a set of 
variables the subset that results in the best estimation of the 
criterion. In the present case, because the program was 
unrestricted, all the variables entered, but it is obvious that 
after Step 3 no other variable would be included at the .05 

Table 2.—Zero-order Correlation Coefficients between Predicted 
and Obtained Mean Cloze Scores—Bormuth's Readability Formulas 

Item             Obtained   Predicted   Predicted   Predicted   Predicted 
                 Cloze      Formula 1   Formula 2   Formula 3   Formula 4 

Obtained Cloze   1.00 
Formula 1        .359       1.00 
Formula 2        .461       .753        1.00 
Formula 3        .495       .777        .859        1.00 
Formula 4        .374       .852        .623        .680        1.00 

p (r = .373) = .01 


Table 3.—Unrestricted Stepwise Multiple Correlation—Summary of Steps 

Step  Variable                         Standard   Coefficient   Shrunken   Coefficient   Significance 
                                       Error of   of Multiple   Value      of Determi-   Level 
                                       Estimate   Correlation              nation 

 1    structure words/nouns            1.337      .466          .466       .217          .001 
 2    letters/sentences                1.241      .582          .569       .339          .008 
 3    repetitious anaphora/words       1.176      .648          .626       .419          .022 
 4    structure words/adjectives       1.153      .675          .645       .456          .111 
 5    Dale-Chall List 769/words        1.130      .701          .663       .491          .109 
 6    adjectives/structure words       1.084      .737          .696       .543          .044 
 7    personal pronouns/words          1.067      .754          .708       .569          .144 
 8    class inclusions/words           1.072      .760          .705       .577          .412 
 9    letters/syllables                1.081      .763          .699       .582          .533 
10    personal pronouns/conjunctions   1.091      .765          .692       .586          .575 
11    words/sentences                  1.102      .768          .685       .590          .544 
12    letters/words                    1.099      .777          .687       .604          .296 
13    syllables/words                  1.112      .780          .679       .608          .606 
14    letters/independent clauses      [illeg.]   [illeg.]      [illeg.]   [illeg.]      [illeg.] 
15    syllables/independent clauses    [illeg.]   [illeg.]      .657       .626          .517 
16    words/independent clauses        [illeg.]   .791          .642       .626          .828 
17    syllables/sentences              [illeg.]   .792          .626       .627          .889 
18    Dale-Chall List 3000/words       [illeg.]   .792          .607       .627          [illeg.] 
19    common noun/structure words 


level of significance. This is what happened in the restricted 
program. After Step 7 the addition of new variables not 
only does not contribute substantially to the correlation, 
but the shrunken value of the correlation (5) starts to 
diminish after this point. That means that the apparent 
increase in the correlation is simply a spurious result that 
disappears when the loss of degrees of freedom is taken into 
consideration. If the first seven steps are compared with 
Bormuth's 1966 and 1969 equations, it becomes obvious 
that the best set of predictors differs in each case. Out of 
the four variables included in the 1966 equation, none 
entered in this equation; of the eight variables included in 
the 1969 equation, only three entered the present equation. 
On the other hand, of the eight promising variables listed in 
Table 1, four entered the equation. Table 4 summarizes the 
results and shows also the step in which the variable entered. 

From the results presented in Table 4, it seems that when 
the full range of ability of the subjects and difficulty of 
materials is used, the best predictors are not the same as 
when a restricted range is used. This may be another reason 
for the reduction in the correlation values. 


How well is cloze mean predicted? 

Table 3 reports in column 4 the multiple correlation 
coefficient obtained in this study. If only those variables 
that reach a .05 significance level are included, the correlation 
is .648; that is, 42% of the total variance can be explained 
in terms of three variables: structural words per noun, letters 


Table 4.—Variables Entering the Multiple Regression Equation 
Reported in 1966 and 1969 by Bormuth and Their Position in the 
Present Equation 

Year of  Variable in                          Position in   Promising                          Position in 
Study    Equation                             Present Eq.   Variable                           Present Eq. 

1966     Letter/independent clause            14            Syllables/independent clause       15 
         Nouns/strings                        19            Words/independent clause           16 
         Letter/words                         12            Structure words/nouns               1 
         Pronouns/conjunctions                10            Adjectives/structure words          6 
1969     Syllables/sentences                  17 
         Syllables/words                      13            Anaphora/words                      3 
         Personal pronouns/words               7            Class inclusions analysis/words     8 
         Dale-Chall List 769/words             5            Structure words/adjectives          4 
         Dale-Chall List 3000/words           18 
         Words/sentences                      11 
         Letters/words                        12 
         Letters/meaningful punctuation 
           units                              [illeg.] 



per sentence, referential repetition anaphora per word. 
Notice, nevertheless, that this is an ad hoc correlation, that 
is, an optimum value for this particular set of passages. In 
Study II, a cross-validation of the formula obtained in 
Study I is performed. 
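
The relation between the multiple correlation, the proportion of variance explained, and the shrunken value in Table 3 can be checked numerically. The text does not state which shrinkage formula was used; the sketch below assumes the form R_s^2 = 1 - (1 - R^2)(N - 1)/(N - k), with N = 45 passages and k predictors, chosen only because it agrees with the clearly legible first three rows of Table 3. The function name is hypothetical.

```python
import math


def shrunken_r(r, n_passages, n_predictors):
    """Multiple R corrected for the degrees of freedom used by the predictors.

    Assumed form (not stated in the text): R_s^2 = 1 - (1 - R^2)(N - 1)/(N - k).
    """
    ratio = (n_passages - 1) / (n_passages - n_predictors)
    r2_shrunken = 1.0 - (1.0 - r ** 2) * ratio
    return math.sqrt(max(r2_shrunken, 0.0))


# (number of predictors, multiple R) for the first three steps of Table 3.
steps = [(1, 0.466), (2, 0.582), (3, 0.648)]
n_passages = 45  # textbook passages used in Study I

for k, r in steps:
    variance_explained = r ** 2
    print(k, round(variance_explained, 3), round(shrunken_r(r, n_passages, k), 3))
```

With three predictors the squared correlation is about .42, the 42% of variance quoted above, and the assumed correction gives shrunken values of .466, .569, and .626 for the first three steps, matching the table. Because each added predictor raises the penalty, the shrunken value can fall even while the uncorrected R keeps rising, which is the pattern described after Step 7.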


Study II 


Two new sets of data, leisure-time activities and newspapers, 
from the same literacy study were used in this 
study. The passages were also 60 to 70 words long, and 
10-word cloze tests had been developed on them. The subjects 
were again seventh graders. The same linguistic variables 
were computed for each passage. 

For each passage five predicted cloze means were obtained: 
one for each of Bormuth's formulas and one for the formula 
obtained in Study I in this report. Zero-order correlation 
coefficients were computed between all predicted means and 
the obtained cloze means. 

Table 5 presents the results. All correlation coefficients 
are significant at the .01 level, except, again, for the results 
of Formula 1. Bormuth's Formula 3 gives consistently 
higher estimates than the other three. Formulas 2 and 3 use 
almost the same numerical values and variables, except that 
Formula 3 uses the Dale-Chall long list of common words, 
whereas Formula 2 uses the short list. Apparently, by including 
a larger range of common words, prediction can be 
improved substantially, at least at this grade level. 

The formula developed in the textbook materials gives 
better cross-validation results than Bormuth's formulas. 
This can be accounted for in three different ways: 

1. Although the materials are different in the cross-validation 
sample, the subjects come from the same 
population. Bormuth's formula was developed not 
only for a different set of materials, but also using 
subjects from a different population. 

2. The best predictor variables under the present circumstances 
are different from when the full range is 
considered. 

3. The third possibility is a combination of both. Although 
Bormuth's formulas have a greater generality, the 
evidence from this study tends to suggest that formulas 
developed for a specific population may give 
better results than formulas developed for a more 
general purpose. 

Table 6 shows the intercorrelation between the estimated 
cloze means for the different formulas. An important 
result is the fact that the intercorrelations among 
Bormuth's formulas are higher than the correlation of any 
of them with the textbook formula. This again seems to 
suggest that under the circumstances of this study, another 
set of variables should be used for maximum prediction. 


Summary and Conclusions 


Readability formulas that have been developed for 
general purposes are usually cross-validated in samples of 
materials of a wide difficulty range and with subjects of 
wide ability range. In the more restricted classroom situation 
or when a school system evaluates materials for a 
specific grade level, the situation is somewhat different due 
to the restricted range of both materials and ability level. 
This study was performed to test (under the latter situation) 
the performance of four readability formulas reported by 
Bormuth in 1969. These formulas were selected because 

Table 5.—Zero-order Correlation Coefficients between Obtained 
Cloze Mean and Predicted Cloze Mean—Predictions Using Bormuth's 
and This Project's Formulas 

                                              Formula 
Materials                            1      2      3      4      Textbooks 

Textbooks (N = 45)                   .359   .461   .495   .374   .647* 
Newspapers (N = 36)                  .454   .428   .470   .419   .549 
Recreational (N = 23)                .487   .461   .561   .525   .552 
Textbook + Newspaper 
  + Recreational (N = 104)           .410   .448   .475   .455   .495† 
Newspapers + Recreational (N = 59)   .502   .495   .526   .506   .581 

* Not cross-validation data 
† Spurious value; includes data used for development of the 
formula 


Table 6.—Intercorrelations between Means Predicted with Bormuth's 
Formulas and the Textbook Materials' Formula 

                  Bormuth's Formulas 
Item         1      2      3      4      Textbooks 

1            1.00 
2            .926   1.00 
3            .925   .959   1.00 
4            .835   .819   .810   1.00 
Textbooks    .534   .529   .468   .472   1.00 



they show an unprecedented degree of accuracy in prediction. 

Sets of materials from a literacy evaluation in Madison 
Public Schools were used as a cross-validation sample for 
this study. The subjects were seventh graders, and the 
domains of materials were textbooks, leisure-time reading, 
and newspapers. The results show a reduction in the validity 
of the formulas; that result was expected, but a word of 
caution is necessary when considering the results. Because 
the materials and tests were not developed ad hoc for this 
study, the short length of the passages may have led to a 
lower reliability of the criterion scores as well as of the 
linguistic variables used and/or to a curvilinear relationship 
among them. Since it is not possible to determine if the 
reduction is due to the changed circumstances of the study 
or to low reliability and the effect of curvilinearity, it is 
recommended that the results of this study be considered 
as provisional until another study is performed using longer 
passages. 



The results show that a drop in the correlation should be 
expected from around .90 in Bormuth's cross-validation 
studies (2) to around .45, which can be considered as 
a "floor" value; that is, the validity should not drop much 
lower in subsequent samples. On the other hand, it seems 
that formulas developed in samples of subjects from the 
same population may perform better than general formulas. 
The present study also suggests that the best set of predictors 
may be different under the changed circumstances 
in which it was performed. Given the theoretical and practical 
foundations that Bormuth's studies have laid for a 
systematic exploration of readability and the more economically 
feasible computer technology available today, 
it should be possible for large school systems to develop 
formulas for their population that reach a high degree of 
accuracy even under circumstances of restricted ability and 
difficulty range. 

Finally, the possibility could be considered of developing 
formulas that are specific not only to a restricted population 
of subjects, but also to a universe of materials, that is, 
specific formulas for seventh-grade textbook materials, for 
instance, or for newspapers and magazines. These further 
restrictions would improve the prediction ability of the 
formulas and would result in a better match between ability 
of the students and the difficulty of the materials. 


1. This project was carried out in cooperation with the Curric... 
and the Instructional Research Laboratory of the University of 
Wisconsin. 

2. ... Clasen, Associate Director of the University of Wisconsin 
Instructional Research Laboratory, for his support and counsel 
during this project. 

3. These computation formulas are currently being revised. They 
may be obtained by writing Professor John Bormuth at the 
University of Chicago. 


REFERENCES 

1. Bormuth, J. R., "Readability: A New Approach," Reading 
Research Quarterly, 1, 3: 79-132, 1966. 
2. Bormuth, J. R., Development of Readability Analyses, 
University of Chicago, 1969. 
3. ... Reading Literacy—An Interim Report, Madison Public 
Schools, ... 
4. Klare, G. R., The Measurement of Readability, Iowa State 
University Press, 1963. 
5. ... New York, 1969. 
6. University of Wisconsin, STATJOB, University of Wisconsin ... 


TEACHER SELF-ACCEPTANCE, ACCEPTANCE OF 
OTHERS, AND PUPIL CONTROL IDEOLOGY 


ABSTRACT 


The relationships between teacher self-acceptance, acceptance of 
others, and pupil control ideology were examined. Levels of 
self-acceptance and acceptance of others were measured using 
Berger's instrument. The Pupil Control Ideology (PCI) Form served 
as the operational definition for teacher orientations toward 
pupil control. A total of 276 teachers responded to these 
instruments. Pearson product-moment correlations indicated that 
self-acceptance was not related to PCI, but that high acceptance 
of others was associated with humanism in PCI. Regression analysis 
indicated that acceptance of others, followed by teaching level 
and teaching experience, predicted teacher PCI. Speculations on 
why self-acceptance was not associated with teacher views on 
control were presented. 


THE CONCEPT OF SELF has long intrigued students of 
human behavior. William James, Freud, George Herbert 
Mead, and Charles Horton Cooley among others have 
contributed to existing thought on the subject. “Self” has 
been investigated from a variety of perspectives including 
... and self-acceptance. For a review see Wylie (16). 

Studies by Fey (2), Sheerer (13), Berger (1), and 
Omwake (11) link self-acceptance and acceptance of 
others. Both self-acceptance and acceptance of others 
often are assumed to be desirable characteristics of 
teachers. 

The present inquiry examined the relationships of 
public school teachers' levels of self-acceptance and 
acceptance of others and their pupil control ideology. 
Pupil control ideology has been conceptualized on a 
humanistic-custodial continuum (14), and there have been 
a large number of investigations of ... pupil control 
ideology .... Studies of teacher predispositions and pupil 
control ideology report that low dogmatism (15), 
commitment to emergent rather than traditional values (6), 
low status obeisance or deference (5), high creativity (4), 
high levels of self-actualization (9), and high expressed 
own and wanted behaviors of inclusion, control, and 
affection as measured by Schutz's FIRO-B scale (8) are 
associated with humanistic teacher pupil control ideology. 
In addition, high teacher sense of power was found to be 
related to teacher pupil control ideology—pupil control 
behavior congruence. 

McAndrews (10) tested two hypotheses concerning 
teacher self-esteem and pupil control ideology. One hypothesis 
suggested that high self-esteem would be associated 
with a humanistic pupil control ideology. A second hypothesis 
proposed that teacher self-esteem would be negatively 
related to conformity, defined as the congruence of self 
pupil control ideology and the perceived pupil control 
ideology of colleagues operationalized in terms of “the 
typical teacher in your building.” Neither of these hypotheses 
was supported empirically. 

The present study builds on McAndrews' work, and 
refines it in the sense that it tapped a somewhat different 
dimension of self and added another concept. McAndrews' 
definition of self-esteem was based on the discrepancy 
between reported self and ideal self. Wylie (16) pointed 
out that self-esteem or congruence between self and ideal 
self means being proud of one's self, while self-acceptance 
means respecting one's self including one's recognized 
faults. The former concept was employed by McAndrews; 
this research utilized the latter concept and, in addition, 
the concept of acceptance of others. 

Two hypotheses guided the investigation: (1) Teacher 
self-acceptance will be positively related to humanism in 
pupil control ideology; and (2) Teacher acceptance of 
others will be positively related to humanism in pupil 
control ideology. 

The rationale for these hypotheses was simple and 
straightforward. It was grounded in the notion that an 
individual who is self-accepting also is likely to accept 
others and exhibit a humane person-oriented stance toward 
those with whom he or she interacts. In the case of public 
school teachers this type of person seems likely to hold a 
humanistic pupil control ideology. 


Method 


Instruments 


In order to test the hypotheses, operational measures of 
self-acceptance, acceptance of others, and pupil control 
ideology were required. Self-acceptance and acceptance of 
others were measured by the Self-Acceptance and Acceptance 
of Others (SAAO) Form developed by Berger (1). The Pupil 
Control Ideology (PCI) Form devised by ... served as a 
measure for pupil control ideology. 

For purposes of developing the instrument, acceptance 
of self was defined as the possession of behavior patterns 
guided by internalized values, ... 
sense of self-worth, and an absence of shyness or self-consciousness. 
Acceptance of others was defined as behavior 
patterns guided by acceptance of individual differences, 
lack of dominance, service, interest in satisfactory relationships, 
and a belief that individuals are responsible for their 
actions (1). 

The instrument consists of 64 Likert-type items with 
five response categories ranging from "true of myself" to 
"not at all true of myself." The self-acceptance part of the 
scale is composed of 36 items with a possible range of 
scores of 36 to 180; the acceptance of others section contains 
28 items with a 28 to 140 scoring range. On both scales, 
the higher the score, the more accepting the respondent. 
Reported SAAO Form matched-half reliabilities were .89 
or greater, except in one case where a corrected coefficient 
of .75 was indicated. Validity was based on essays written 
by 40 subjects, with half of the subjects writing on their 
attitudes about self and half on their attitudes about others. 
These documents were evaluated by four judges and the 
mean ratings were correlated with scale scores. The correlations 
between ratings and scores were .90 for self-acceptance 
and .73 for acceptance of others (1). 

The PCI Form taps educators' views on pupil control 
on a humanistic-custodial continuum. A humanistic 
orientation toward pupil control stresses an accepting, 
trustful view of pupils and optimism concerning their 
ability to be self-disciplining. A custodial pupil control 
ideology emphasizes the maintenance of order, distrust of 
pupils, and a moralistic stance toward deviance. The 20-item 
Likert-type device uses a 5-point response scale ranging from 
strongly agree to strongly disagree. Examples of items are, 
“being friendly with pupils often leads them to become too 
familiar,” and “pupils can be trusted to work together without 
supervision” (reverse scored). The scoring range on the 
instrument is from 20 to 100; the higher the score, the more 
custodial the ideology. Reported split-half reliabilities 
are above .90. Validity studies were based on the use of the 
PCI Form, with teachers judged by their principals to be 
custodial or humanistic (14). 
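
The scoring ranges quoted for both instruments follow from summing 5-point item responses, with some items reverse scored. The sketch below illustrates that arithmetic only; it is not the published scoring key. The function name is hypothetical, and the coding of 5 as the most custodial (or most accepting) response is an assumption made for illustration.

```python
def score_likert(responses, reverse_items=(), points=5):
    """Sum Likert responses coded 1..points, flipping reverse-scored items.

    responses: one integer per item.
    reverse_items: 0-based positions of reverse-scored items (hypothetical;
    the actual keys of the SAAO and PCI Forms are not given in the text).
    """
    reverse = set(reverse_items)
    total = 0
    for index, value in enumerate(responses):
        if not 1 <= value <= points:
            raise ValueError(f"item {index} has out-of-range response {value}")
        # A reverse-scored item maps 1 -> points, 2 -> points - 1, and so on.
        total += (points + 1 - value) if index in reverse else value
    return total


# Extremes for a 20-item, 5-point form reproduce the 20 to 100 range.
lowest = score_likert([1] * 20)
highest = score_likert([5] * 20)
print(lowest, highest)
```

The same function applied to 36 or 28 items gives the 36 to 180 and 28 to 140 ranges reported for the two SAAO sections.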


Sample 

The two instruments described and an information 
sheet requesting demographic-type data were sent with a 
cover letter to the faculties of ten school building units in 
a single school district in central Pennsylvania. A total of 
342 teachers received the packets, and 276, or 81 per cent, 
of them returned usable forms. 


Results 


The statistical method used to examine the major hypotheses 
was the Pearson product-moment correlation. The first 
hypothesis, which proposed a positive relationship between 
teacher self-acceptance and humanism in pupil control 
ideology, was rejected. Relevant data are presented in 
Table 1. 

Table 1.—Correlation between Teachers' Self-Acceptance and 
Pupil Control Ideology 

Variables                 N     Mean    SD     r      r²      p 

Self-acceptance           276   148.2   15.6 
Pupil Control Ideology    276   52.5    9.0    -.07   .0049   NS 


Separate correlations were also calculated between these 
two variables for the subsamples of 102 elementary teachers, 
81 middle school teachers, and 93 high school teachers. 
None of these correlations was significant. 

The second hypothesis indicated that teacher acceptance 
of others would be positively related to humanism in 
pupil control ideology. Although the correlation between 
the variables was only moderate, the association was a 
significant one, and the hypothesis could not be rejected. 
See Table 2 for pertinent information. 


Table 2.—Correlation Between Teachers' Acceptance of Others and 
Pupil Control Ideology 

Variables                 N     Mean    SD     r      r²      p 

Acceptance of Others      ...   ...     ... 
Pupil Control Ideology    276   52.5    9.0    -.28   .078    <.001 


Correlations were again computed separately for teachers 
at the elementary, middle school, and high school 
levels. The correlation coefficient for elementary teachers 
was -.28, significant at the .01 level; for middle school 
teachers it was -.15, not significant; and for high school 
teachers it was -.35, significant beyond the .001 level. 
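
The significance of a Pearson correlation depends on sample size as well as on the size of r, which is why the same coefficient can be judged differently in the full sample and in a subsample. The text does not say how significance was tested; the sketch below assumes the conventional t statistic, t = r * sqrt(N - 2) / sqrt(1 - r^2), with N - 2 degrees of freedom. The function name is hypothetical.

```python
import math


def t_for_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df.

    Assumes the conventional test; the article does not name the procedure.
    """
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


# (label, r, N) as reported in the text and tables.
reported = [
    ("self-acceptance, all teachers", -0.07, 276),
    ("acceptance of others, all teachers", -0.28, 276),
    ("acceptance of others, elementary", -0.28, 102),
    ("acceptance of others, middle school", -0.15, 81),
    ("acceptance of others, high school", -0.35, 93),
]

for label, r, n in reported:
    print(f"{label}: r = {r:+.2f}, r^2 = {r * r:.4f}, t = {t_for_r(r, n):+.2f}")
```

For large samples a two-tailed test at the .05 level needs |t| of roughly 1.96 and at the .001 level roughly 3.3. Under that rule of thumb, the values computed here agree with the pattern of significant and nonsignificant results reported above.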

In addition, a multiple linear stepwise regression 
technique was used to determine the most significant 
predictors of the dependent variable, pupil control 
ideology. The independent variables were self-acceptance, 
acceptance of others, teacher sex, teaching level, and 
teaching experience. The first variable added to the 
regression equation was the one which made the greatest 
improvement in "goodness of fit." The next most significant 
variable was then added until all had been considered. 
Those variables not maintaining a default tolerance of .01 
were dropped from the equation, while those not 
attaining that level did not enter the equation. The final 
regression equation contained the variables that, in combination, 
represented the best predictive value while 
holding the other variables constant. Guilford (3) provides 
information on this procedure. 

The results of this analysis are shown in Table 3. 
Acceptance of others, teaching level, and teaching 
experience were the most significant predictors of pupil 
control ideology, with acceptance of others being the 
single best predictor. 


Table 3.—Partial and Multiple Correlations of Predictors of Pupil 
Control Ideology 

Variables               N     Partial r   r²      R      R² 

Acceptance of Others    276   -.27        .073 
Teaching Level          276   -.24        .058 
Teaching Experience     276    .19        .036 
All Variables           276                       .40    .16 


Some additional findings of note are the following. 
Although male and female teachers did not differ signif- 
icantly in self-acceptance, female teachers were more 
accepting of others than male teachers. The respective 
means of 113.3 and 108.7 yielded a Behrens-Fisher t value 
of 3.66 which was significant at the .001 level. Female 
teachers were also more humanistic in pupil control 
ideology than male teachers; their respective mean PCI 
Form scores were 51.2 and 54.7, and the resulting t ratio 
of 3.11 had a probability beyond the .005 level. 

Elementary teachers, with a mean PCI Form score of 
49.6, were significantly more humanistic than middle 
school teachers, who scored 53.9, and significantly more 
humanistic than high school teachers, who scored 54.5. 
The respective t values of 3.38 and 3.85 both carried 
probabilities beyond the .001 level. Teaching experience also 
was associated with a more custodial pupil control 
ideology. Teachers having more than five years of 
experience exhibited a PCI Form mean score of 54.1, while 
those with five years or less experience had a mean score 
... .001 level. These results support those of past 
investigations of pupil control ideology (15). 

Finally, it was found that the correlation for the entire 
sample between teacher self-acceptance and acceptance of 
others was .40. This is significant at the .001 level and is 
consistent with the results of previous studies. 


Discussion 


Our data indicate that acceptance of others—but not 


sell-acceptance— predicts teacher pupil control ideology. 
Since teaching is an occupation that occurs in a setting 


with orientations toward pupil control. However, it was
not expected that self-acceptance would be unrelated to
these orientations, even though our result fits McAndrews'
findings (10) on teacher self-esteem and pupil control
ideology.


Several speculations are suggested. The concept of 
self-acceptance appears to gain at least some of its theoret- 
ical utility from the focus on pathology so often found in 
work in social and clinical psychology. It may be that self- 
acceptance predicts well for those at the extremes of a self- 
acceptance continuum. It is also possible that those at the 
lower extreme of this continuum are eliminated at some 
point in the process of teacher selection or soon drop out 
of teaching in favor of some other kind of work. This is 
consistent with the fact that the teachers in the present 
sample had a higher mean level of self-acceptance than any 
of the groups reported on by Berger (1). 


Another conjecture is that self-acceptance simply may 
not be as significant as has been believed in influencing 
job-related attitudes, especially when the socialization 
process functions effectively. The fact that, in addition to
acceptance of others, teaching level and teaching experience
were predictors of pupil control ideology tends to buttress
this contention. In contrast to self-acceptance, acceptance
of others is associated with teacher pupil control
orientations. This suggests that acceptance of others, as compared
with self-acceptance, is less constrained by organizational
and other social factors, and is the kind of personal
quality that can find legitimate expression in the school's
social setting.


The authors make the usual disclaimers in connection
with this research. In particular, it should be borne in mind
that the teacher sample came from a single school district.
Nevertheless, it is believed that a number of pertinent
questions have been explored both in this inquiry and in
this speculative analysis.
ABSTRACT


LEARNING IMPAIRMENT can be caused either by 
physiological or by environmental factors. It is of great 
importance to know if the learning deficiency in a person is 
based on a functional impairment of the brain in those areas 
which are operative in information acquisition and recall, or 
if it is a result of parental, sociological, pupil-teacher rela- 
tionship or other personality interactions, or drug effect. 

If such distinction can be made easily, fast, and inexpen- 
sively, then a breakthrough will have been achieved by 
placing every child in an optimal level of the educational 
system for maximum learning experience. And, to comply 
with the law requiring the psychological screening of all 
school children, the proposed system will solve the difficul- 
ties by verifying the normalcy of the majority of the stu- 
dents, who, therefore, will not require any further diagnos- 
tic testing. By eliminating 80 to 85% of the testees, the 
funds and professional personnel may then be adequate to 
cope with the remaining workload. Thus, the professional 
personnel can devote intensive attention to the remedial 
management of those children who have learning handicaps. 
The information obtained with the Synchrocephalograph 
(SCG) is of invaluable assistance to classroom teachers, psy- 
chometrists, and parents in understanding the impairment 
of learning-deficient children. 

The human brain capability for learning can now be mea- 
sured with objectivity and reliability using the SCG. This 
technique does not permit a direct numerical correlation 
with the generally used IQ test. The reason for this is well 
understood and is related to the fact that the requirements 
for the measurements of the intelligence quotient are not in 
direct function with learning capability. 


Use of the SCG advantageously allows for an objective
measurement of the cognitive capability and related
psychophysical factors of the brain; therefore, the results of this
measurement provide guidance for efficient teaching and
training, and most useful information for individual teaching,
behavior management, and perceptual motor development.
Thus, one of the major advantages of the use of the SCG is
that it prevents mislabeling and misplacing a child in the
educational delivery system, and proves that learning
dysfunction can be teaching dysfunction.

Other applications are, for example, assistance in various
manpower training programs, such as CETA, and also as a
tool to approximate the objective, functional human age,
thereby removing the discrepancies and injustices caused by
chronological age limits in various occupations.


Theoretical Considerations 
Neural Efficiency and Intelligence 


Neither the mechanism nor the cybernetic interactions
which generate these systems outputs of the brain is clearly
understood. It is certain that these electrical activities,
the neural information transfer rate, or neural efficiency (NE),
and the hemispheric synchronization, or time-delay (TD), are
independently related to general intelligence.

The electrical activity of the brain, detectable on the
surface of the head, represents the statistical behavior of
the neurons as they process information. This activity in
the brain is analog and digital, but due to volume
conduction and other technical factors, only analog signals can
be measured on the surface of the head. The interaction of
many signals of different intensities and phase relations
results in a particular spectrum of the composite signal.
Individual differences are small in the alpha-band (8-12 Hz);
however, at higher frequencies they are substantial. A large
amplitude at a given frequency means that many cell
assemblies, to use Hebb's terminology, are synchronized and
active at that instance. Such synchronization can occur
randomly, or as a necessary concomitant of the information
processing program.

The concept of neural efficiency, which presently is 
restricted to time-domain analysis only, is based on the fol- 
lowing testable hypotheses, as developed by Ertl: 

General Hypothesis: The efficiency of information pro- 
cessing in the brain is related to the electrical signals re- 
quired and used by the system. 

Specific Hypothesis: The average frequency of non-alpha
activity is related to information processing efficiency.

A great deal of effort has been spent on the IQ concept. 
After sixty years of work, thousands of articles, and millions 
of IQ tests, it is now generally agreed that we do not know
exactly what human intelligence is, or how to measure it
accurately. IQ test scores are not culture-free or even
culture-fair, and the test scores do not successfully measure the
potential to learn, but only what has already been learned.
IQ test scores are also poor predictors of job success or 
academic achievement. There is some correlation between 
NE and IQ test scores at the extremes, but even less in the 
middle range. The results simply indicate that both methods 
overlap and measure some aspect of intelligence. 

The brain is anatomically divided into two halves, and 
communication exists between the hemispheres. It is there- 
fore reasonable to assume that the synchronization of this 
communication process may be an important variable in re- 
lation to intelligence. The SCG is designed to measure the 
degree of cross-correlation between the EEG derived from 
the right and left hemispheres. 
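
As a rough illustration of what a cross-correlation between two hemispheric signals involves, the following Python sketch estimates the lag at which one sampled signal best matches another. It is not the SCG's own algorithm, which is not described here; the function name `best_lag`, the synthetic noise signals, and the 1 kHz sampling rate are assumptions made purely for illustration.

```python
import random

def best_lag(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive result means `right` trails `left` by that many samples.
    """
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:  # only overlapping samples contribute
                score += (left[i] - mean_l) * (right[j] - mean_r)
        if score > best_score:
            best, best_score = lag, score
    return best

# Hypothetical data: the same noise sequence on both channels,
# with the "right" channel delayed by 5 samples.
fs = 1000          # assumed sampling rate in Hz
delay = 5          # assumed delay in samples (5 ms at 1 kHz)
n = 500
rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(n + delay)]
left = base[delay:]    # left[i] == base[i + delay]
right = base[:n]       # right[i + delay] == left[i]

lag_samples = best_lag(left, right, max_lag=20)
lag_ms = 1000 * lag_samples / fs
print(lag_samples, lag_ms)
```

A broadband test signal is used because its cross-correlation has a single sharp peak; with a pure sine wave the peak is nearly flat and the estimated lag becomes unreliable.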

Subjects with learning disabilities generally have relatively
large time-difference scores, but sometimes normal NE
scores. The NE variable and the symmetry variable may
tentatively be regarded as the output efficiency of
information management of the brain.


IQ Testing 


Learning ability and efficient [...] to intelligence. Therefore,
[...] should scale the perceptual and [...] an individual, the
widely used intel[...] enormously important role in scaling
young [...] application of the usual, so-called [...] Quotient
Test not only [...] fidelity due to inherent systematic err[...]
testee [...] if applied to individuals belonging to groups with
[...] cultural and socio-economic [...]. In general, the scaling
of children with respect [...] ability is, in spite of the
scientific efforts of standardization, in a most objectionable
state.

If one complicates matters by asking for comparative
values in learning potential of various social or ethnic
subsets, such as black and white high school students of various
sexes and ages, one arrives at very controversial and even
inflammatory issues which, so far, have not received objective,
unbiased, and reliable research effort. Thus, urgent research
is needed to: (1) Improve the scaling of individual children
when statistically determined standards are applied; and
(2) Test and apply hardware which can measure in an
objective, repeatable, unbiased, fast, and inexpensive manner
the learning potential of children and adolescents. Hereby
various predetermined parameters such as age, sex, ethnic
origin, various psychological test results, etc., would be
given statistically significant sample space. Such research
results could be subjected to a correlation and
cross-correlation study, which would give the categorization of
the population a more solid basis.


Further Limitations of Standardized Testing 


The establishment of the degree of capability of a child
to learn and mentally mature as a function of its chronological
age is a widely practiced occupation among school
psychologists. Almost daily, new tests are added to the already
voluminous battery. Considerable effort and funds are
spent to determine the sensitivity, repeatability, and error
margin of these tests when applied to a very heterogeneous
population.

The inherent variability of our child population makes
any standardization most difficult to achieve even in the
objective tests, not to mention the more subjective tests
needed to evaluate emotional parameters. The perturbance
caused by the variable influence of the testor on the testee's
score is often overlooked, and in many cases the
administration of a test battery to place a child precisely in a given
maturity group or to determine learning deficiency is not
a very reliable activity at the present time. Nevertheless,
it is most important to place children into compatible
groups to prevent performance decrement when exposed
to heterogeneous, highly competitive groups and to
uniform instruction. This, in turn, necessitates analytical
evaluation through testing.


Improving Evaluative Ability 

If one is confronted with a large number of test scores,
the validity of each individual test in the sample space is
less significant than the overall predictability of the
statistically assessed values. Although the individual tests
have, in most cases, a well-known error range, which might
be quite significant, the aid of large numbers in the sample
space will smooth out extreme values, thus rendering the
test's predictability statistically acceptable. If, however, the
outcome of the test score bears great importance on the
future of an individual child, then the broad limits of the
scattered scores cannot be overlooked. To improve the
evaluation of individual scores, which are compared to
statistical averages, one can apply the Bayes theorem,
which is based on the principles of conditional and total
probabilities.

The definition of conditional probability, namely, the
probability of X occurring if Y has been already decided, can
be expressed by:

\[ p(X \mid Y) = \frac{p(XY)}{p(Y)} \tag{1} \]

whereby p(Y) ≠ 0, and X and Y are score values in this
case.

Venn-Diagrams are a good way to illustrate XY, and with
such methods an adaptive algorithm can be developed by
which one can give higher fidelity to the solutions in
dynamic programing. This is needed for such models as the
prediction of the validity of test scores; however, this
method corresponds with considerable difficulties in the
numerical and mathematical manipulation. Utilizing the
Bayes theorem, let X_1, X_2, through X_n be the number of
mutually exclusive events, whereby the probabilities of all
events must be greater than zero, and X should cover the
total event space. The total probability of an incidental
event, where Y = X_1 Y + X_2 Y + ... + X_n Y, is:

\[ p(Y) = \sum_{i=1}^{n} p(X_i Y) = \sum_{i=1}^{n} p(Y \mid X_i)\, p(X_i) \tag{2} \]

Assuming this statement, and utilizing Equation (1), one
can obtain the probability of the event density X_i
(i = 1, 2, ..., n) through the Bayes theorem, provided that
the event Y has already occurred:

\[ p(X_i \mid Y) = \frac{p(X_i Y)}{p(Y)} = \frac{p(Y \mid X_i)\, p(X_i)}{\sum_{i=1}^{n} p(Y \mid X_i)\, p(X_i)} \tag{3} \]

If we apply this to the prediction of the validity of test
scores, then we must have some statistical prerequisites.
For instance, let us assume that from a statistically
significant sample space in a given test, 20% of the population
score high (X_1), 20% score marginal (X_2), and 60% fail the
test (X_3). The sensitivity of test is such that 90% of the
failures (Y) are recognized as belonging into this class and
95% of the high scores are recognized as such. In the case
of marginal scores, 50% are considered "passing" and
50% as "not passing." The problem is to find the probability
of a given high scorer to be really a high scorer. With the
Bayes theorem, and with Y here denoting an observed high
score, we find that:

\[ p(X_1 \mid Y) = \frac{p(Y \mid X_1)\, p(X_1)}{p(Y \mid X_1)\, p(X_1) + p(Y \mid X_2)\, p(X_2) + p(Y \mid X_3)\, p(X_3)} \]
\[ = \frac{0.95 \cdot 0.20}{0.95 \cdot 0.20 + 0.50 \cdot 0.20 + 0.10 \cdot 0.60} = \frac{0.19}{0.35} = 0.543 \]

This means that only 54.3% of the high scores are really the
high scores. And, for p(X_3|Y):

\[ p(X_3 \mid Y) = \frac{0.10 \cdot 0.60}{0.95 \cdot 0.20 + 0.50 \cdot 0.20 + 0.10 \cdot 0.60} = \frac{0.06}{0.35} = 0.171 \]

which indicates that from those who get high scores, 17.1%
are failures, and consequently, for p(X_2|Y):

\[ p(X_2 \mid Y) = \frac{0.50 \cdot 0.20}{0.95 \cdot 0.20 + 0.50 \cdot 0.20 + 0.10 \cdot 0.60} = \frac{0.10}{0.35} = 0.286 \]

which means that 28.6% of the high scores are only
marginal.

As far as the failures are concerned, with Y now denoting
an observed failing score, the conditional probabilities are:

\[ p(X_1 \mid Y) = \frac{p(Y \mid X_1)\, p(X_1)}{p(Y \mid X_1)\, p(X_1) + p(Y \mid X_2)\, p(X_2) + p(Y \mid X_3)\, p(X_3)} \]
\[ = \frac{0.05 \cdot 0.20}{0.05 \cdot 0.20 + 0.50 \cdot 0.20 + 0.90 \cdot 0.60} = \frac{0.01}{0.65} = 0.015 \]
\[ p(X_2 \mid Y) = \frac{0.50 \cdot 0.20}{0.65} = 0.154 \]
\[ p(X_3 \mid Y) = \frac{0.90 \cdot 0.60}{0.65} = 0.831 \]

According to this, from those getting a failing score,
only 83.1% are really failing, 15.4% are marginal, and 1.5%
are high scorers.

As indicated, then, the validity of individual test scores
should be based on the considerations described above.
It can be proven mathematically that, by definition, one
cannot expect absolute values by test scores, and their
fidelity lies on an asymptote. All that one can do is to
optimize the point of practical termination of the curve,
which is determined by the utility of the test as well as
by environmental factors and conditions which are brought
into the system "testee-testor-test."

Naturally, by the analytical considerations of test scores,
the obtained information lends itself well to treatment of
measurement if one can reduce the scores to the elementary
units of information. This would enhance the comparability
and, thus, the fidelity of score values, even more. If we
denote the score as source of information (x), then the mean
value H(x) is:

\[ H(x) = \sum_{i=1}^{k} P_i \left( -\log_2 P_i \right) \tag{4} \]

H(x) is expressed in bits per score, of the mean
logarithmic probability for all scores from a given test.
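
The Bayes computations and the information measure of Equation (4) can be checked numerically. The short Python sketch below is only an illustration: the function names `posterior` and `entropy_bits` are hypothetical, and applying Equation (4) to the three-class distribution (high, marginal, failing) is an assumption made to have a concrete source of probabilities.

```python
import math

def posterior(priors, likelihoods):
    """Bayes theorem, Equation (3): p(X_i | Y) for every class X_i."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # total probability p(Y), Equation (2)
    return [j / total for j in joint]

def entropy_bits(probs):
    """Mean information H(x) in bits, Equation (4).

    Terms with zero probability are skipped, since they contribute nothing.
    """
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Figures taken from the worked example in the text.
priors = [0.20, 0.20, 0.60]        # high, marginal, failing
p_high_score = [0.95, 0.50, 0.10]  # p(observed high score | class)
p_fail_score = [0.05, 0.50, 0.90]  # p(observed failing score | class)

given_high = posterior(priors, p_high_score)
given_fail = posterior(priors, p_fail_score)
h = entropy_bits(priors)

print([round(p, 3) for p in given_high])
print([round(p, 3) for p in given_fail])
print(round(h, 3))
```

Rounded to three places, the two posterior lists reproduce the percentages quoted in the text for high and for failing scores.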

Of course, one can also make good use of human
judgment by assessing hard-to-quantize test values, provided one
takes the laws of comparative judgment into careful
consideration. This can be stated for discriminal differences as:

\[ S_1 - S_2 = X_{12} \sqrt{\sigma_1^2 + \sigma_2^2 - 2 r \sigma_1 \sigma_2} \tag{5} \]

Hereby, S_1 and S_2 are the score values of two compared
tests; whereby,

X_12 = the sigma value, representing the proportion of
       judgment P_{1>2}. If this value is greater than
       0.5, the numerical value of X_12 is positive;
       otherwise it is negative.
σ_1  = discriminal dispersion of the information of
       score S_1.
σ_2  = discriminal dispersion of the information of
       score S_2.
r    = correlation between the discriminal deviation of
       S_1 and S_2 in the same judgment.

This type of approach is valid for experimental-analytical
work where Weber's law or Fechner's law is involved, and
in most other educational scaling.
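
Equation (5) can be evaluated directly once the proportion of judgments is converted to its sigma value (the standard normal deviate). The following Python sketch shows one way to do this; the function name `scale_difference` and the numerical inputs are hypothetical, and the proportion must lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist

def scale_difference(p_1_over_2, sigma1, sigma2, r):
    """Equation (5): S1 - S2 = X12 * sqrt(s1^2 + s2^2 - 2*r*s1*s2).

    p_1_over_2 is the proportion of judgments in which score 1 is
    rated above score 2; X12 is its standard normal deviate.
    """
    x12 = NormalDist().inv_cdf(p_1_over_2)
    return x12 * math.sqrt(sigma1**2 + sigma2**2 - 2 * r * sigma1 * sigma2)

# Hypothetical inputs: unit dispersions and uncorrelated deviations.
d_equal = scale_difference(0.50, 1.0, 1.0, 0.0)  # no preference
d_pref = scale_difference(0.84, 1.0, 1.0, 0.0)   # score 1 usually preferred
d_rev = scale_difference(0.16, 1.0, 1.0, 0.0)    # score 2 usually preferred
print(d_equal, d_pref, d_rev)
```

A proportion above 0.5 gives a positive difference and a proportion below 0.5 a negative one, in line with the sign rule stated for X_12.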


Testing by Electroencephalography 


Beside these considerations for the value assessment of
individual test scores, hardware is now available to measure
and provide numerical readout concerning the neural
efficiency (that is, the learning capability) of individuals.

A great deal of research has been done to relate various
parameters of the electrical activity of the human brain to
psychological variables. A comprehensive description of
the successes and failures in this area is given by
Dr. Charles Shagass in his book, Evoked Potentials in
Psychiatry (25).

A few remarks about brain waves in general may be
presented here, as they are relevant to the Brain Wave
Analyzer (BWA). Although alpha waves are most frequent in the
occipital region, they do occur [...] frontal [...] with
frequencies between 8 and 14 Hz. Special attention should be
given to the somewhat slower [...] beta waves [...]
significance in brain wave [...] as described [...]. Beta waves
can be subdivided into beta I and beta II waves, each having
different characteristics. Beta I waves may be
inhibited by increased brain activity, such as learning, while
beta II waves are excited. Theta waves are also in the
frequency range of the analyzer's operation, and occur during
tension and frustration in the parietal region of the brain
of children.

Regarding the "white noise"-type electrical activation
potential, which emanates from the millions of active
neurons in the brain, one would expect a less coherent and
structured encephalogram. Thus, there must exist some sort
of a synchronizer mechanism in the brain, the nature and
location of which are unknown, which is manifest in the
encephalogram. This hypothesis is strengthened by the fact
that increased cerebration decreases the intensities of most
brain waves. Thus, the presence of a coordinating
mechanism, governing inhibition and enhancement of the overall
operation, and also its responding to certain priorities, is
probable. This could account for some signatures in the
EEG, although they are still too complex to permit, at
present, fine-structure interpretation.

The two brain hemispheres are subdivided by the longitudinal
fissure, which is covered by the extension of the dura. There
is, however, a connection between the two hemispheres
through the corpus callosum, which is probably the path of
information between the hemispheres. In normal persons
one side of the brain is dominant over the other. In more
than 90% of the population the left side became dominant,
because, for some reason, the angular gyrus region in the left
half of the brain was used sooner and more frequently for
learning experience. Thus, the side of the brain which gained
the first start increases rapidly in potential, while the other
side remains slight. However, the interpretative and many
motor areas, although highly developed on one side, need the
information from both sides for proper functioning. For
such coherent systems output a certain, very restricted
time-delay in the information transfer between the two
hemispheres is mandatory. The optimal time interval is about 3
to 10 milliseconds; a faster or slower time-delay has a
deleterious effect on cognitive functions. The BWA measures this
time-delay with great accuracy.


The Synchrocephalograph 


The systematic changes induced by sensory stimulation
in the pattern of brain activity are known as evoked
potentials or evoked responses, and it has gradually become evident
that these evoked responses are very sensitive indicators
of psychological and physiological states and changes in
man. The study of evoked responses with the aid of
modern computer technology has opened a small but significant
window to the brain. Based on the work of Dr. J. Ertl and
others, it appears that time-domain analysis is a useful
approach toward the study of the efficiency of any
information processing system, biological or electronic.
Relationships have been reported (13, 17, 26) between psychometric
tests of intelligence and certain time parameters of the
human visual evoked responses.

There is considerable evidence that the late components
of the visual evoked response are sensitive to changes in 
stimulus parameters involving decision making, pattern 
recognition, attention, and problem solving (1, 4, 6, 28). 

In general, there is clear and generally accepted evidence 
that parameters of the evoked response are related to higher 
levels of information processing in the brain (14). 

In normal subjects the electrical activity of the two
hemispheres of the brain is highly synchronized. In persons
suffering from primary learning disabilities, the synchronization,
however, is very poor. In most of these subjects the
left-right differences are more than twice as great as in normal
subjects.

Therefore, there has been developed a simple, easy-to-use,
and inexpensive system to measure neural efficiency and
brain dysfunctions. Considerable computer simulation and
field testing have been done with this so-called
Synchrocephalograph, a neural efficiency analyzer, which was designed
to perform two functions:

1. To measure the rate of information transfer within the
   brain through the analysis of brain wave activity.
   The analysis yields the neural efficiency score which
   can be related to factors of intelligence; and

2. To measure the symmetry of the electrical activity
   from the two hemispheres of the brain. The degree
   of symmetry is related to learning disabilities and
   other brain dysfunctions.

Left-right difference scores are displayed by the analyzer
for two major components of the evoked response. These
learning difference scores will also be useful as clinical
indicators in the study of learning disabilities. The results
obtained with this instrument should be treated in the same
careful, professional manner as the results of a medical
examination or a psychiatric or psychological examination.
Sometimes one obtains large asymmetry scores for many
reasons besides learning disabilities. Whenever large differences
are observed, medical, preferably neurological, examination
is recommended to the patient. Based on the evidence
available to date, the automatic Synchrocephalograph becomes a
useful tool in assessing the basic neurological efficiency of
the human brain, and also in the early diagnosis of brain
dysfunction.

The technical description of the instrument is omitted
here. However, it should be mentioned that for
maximum subject safety, the equipment is battery operated. No
photic or other stimuli for evoked potentials are necessary,
because the environment provides enough stimuli during the
test. The testor must keep the subject relaxed because
muscular tension would cause myoelectric interference with the
EEG. The subject must not be given any task, such as
reading, mental calculations, etc. The oscilloscope is provided
to monitor an artifact-free brain wave pattern during the
test, and "unclean" measurements should be discarded from
the computation. The system is easy to operate, requiring
only a few hours of training. Since brain waves are the basic
data input, it is essential that the operator be able to recognize
artifacts due to improper electrode application or excessive
tension on the part of the subject. This skill can be learned
in a few days. The average test is completed within four
minutes, which does not include the few minutes of
computation and evaluation.


Validity, Reliability and the Error of Measurement 


For all tests of human ability, both psychological and
physiological, information about validity, reliability,
and the error of measurement is important. There are
some well-established rules to describe the validity of
psychometric tests of intelligence. Criterion validity, i.e., a
direct and independent measure of that which the test is
designed to predict, is probably the most powerful
demonstration of validity. From this point of view, the validity
of the neural efficiency test compares very favorably with
another major criterion which is based on the assumption
that intelligence increases with age up to maturity. There
are clear-cut developmental changes in neural efficiency.
Correlation with other tests is a widely used standard of
validity; however, the logic behind this method of
evaluating validity is a little dubious since it assumes that another
test of intelligence already has proven validity.


Impaired Levels of Consciousness 
At any rate, it has been proven that in cases of impaired
consciousness the neural efficiency score changes in the
predicted direction, i.e., longer latencies are associated
with impaired levels of consciousness. This is substantiated
by the various observations concerning evoked responses
and impaired levels of consciousness, examples of which
follow:

1. Sleep: The latencies of the late components of the visual
   evoked response are increased during sleep (19, 14).

2. Experimental Delirium: Ditran, an anti-cholinergic agent,
   produces disruption of cognitive functioning; late
   components of the visual evoked response are increased in
   latency (3).

3. In arteriosclerotic brain syndromes, the late components
   of the visual evoked response are markedly prolonged.

4. Prolonged latencies of the visual evoked response in
   post-traumatic coma were found (4); the same
   observations were made in comatose children (13).

5. Effects of pharmacologic agents which are known to
   reduce levels of consciousness: A large number of
   preanesthetics were studied by Corssen and Domino
   (7, 9). They increase the latency of the components of
   the visual evoked response. The average latencies of all
   components of the visual evoked response of hypothyroid
   patients are longer than the latencies of a similar control
   group. When thyroid hormone was given to the patients,
   the latency differences disappeared (22). Similar studies
   were done with animals with the same results (16).

Age Differentiation

The relationship between age and various parameters of
the evoked response has been extensively studied (1, 2, 11,
18). "Latency appears to be the evoked response
characteristic that is best correlated with age. Latencies are longest
during early life, shortened progressively as development
proceeds, are minimal during young adulthood, and
lengthened again during old age" (4). (See Figure 1.)


Correlations with Other Tests 

Published correlations between the latency of various
components of the visual evoked response and a number
of well-established psychometric tests of intelligence range
between 0.2 and 0.8 (5, 14, 26). In the majority of these
studies, latencies were determined by visual inspection,
which introduced a considerable subjective element, with
the inevitable human error, into the measurements involved.

However, the Synchrocephalograph does not use any of
the principles described in published reports for the
measurement of the latencies of the components of the visual
evoked response. No human interpretation is required, and
the scores are therefore objective. Using the
Synchrocephalograph, correlations of approximately 0.5 were obtained in
a sample of 150 subjects. Research data are continuously
flowing in from users of the instrument, and a
comprehensive report will be available shortly.


Measurement Error 


Due to three-digit readout, the instrumental error is less
than ± 0.5%. The long-term instrument stability is about
one count in 10°. There is, of course, an inevitable
measurement error, due to psychological and physiological
causes, because the brain is not a machine and therefore it
cannot match the stability or repeatability of electronic
equipment. Average variations from one measurement to
the next may be of the order of 2%.


Reliability of the Analyzer

The short-term test-retest correlations range from 0.87 
to 0.97 (2). Test-retest reliability coefficients one week 
apart are approximately 0.88. Long-term reliability of one 
year test-retest is 0.75. 
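
A test-retest reliability coefficient of this kind is an ordinary product-moment correlation between two sets of scores from the same subjects. The Python sketch below shows the computation on invented figures; the function name `pearson_r` and both score lists are hypothetical and are not data from the instrument.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented neural efficiency scores for six subjects, tested twice.
first = [14.2, 16.8, 18.1, 15.5, 20.3, 17.0]
retest = [14.6, 16.1, 18.9, 15.2, 19.8, 17.4]

r = pearson_r(first, retest)
print(round(r, 2))
```

With so few subjects the coefficient is only illustrative; a usable reliability estimate needs a much larger sample.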

The lower long-term reliability is probably due to matur- 
ational changes in the subjects tested. It must be noted 
that these reliability coefficients were obtained by cross- 
correlating test-retest evoked response patterns, and there- 
fore they reflect overall stability of measures of all aspects 
of intelligence. Furthermore, it measures only one factor of 
intelligence, and this factor is also partially measured by 
conventional IQ tests. The neural efficiency method has the 
advantage that it is insensitive to language ability and is 
relatively uninfluenced by socio-economic and cultural 
factors which plague conventional IQ tests. It is probable
that a multitude of environmental factors such as nutrition,
early cultural enrichment, etc., have some effect on the
neural efficiency score.


Evaluation and Conclusions 

The evaluation of the test scores requires a simple
computation based on a mathematical formula developed by
Ertl (14). From the available statistically significant data
certain broad-band norms were developed. The norms
for subject evaluation are shown in Figure 2.


Figure 1.-Age Correction Factor


SCORING VALUES 
BWA-E-03 


Neural efficiency = F — alpha 
1 — alpha x age 


14 or less = below average
14-15      = borderline
15-18      = average
18-21      = above average
21+        = exceptional


Age correction factors:


7 or less = .142
7-9       = .135
9-11      = .128
11-14     = .120
14-20     = .110
20-25     = .100
25+       = .095


Time difference = [...] x phase


0-10  = normal
10-12 = borderline
12-15 = below normal
15+   = problem area


Phase degrees = phase score / 5.56


0-54  = normal
54-65 = borderline
65+   = abnormal


Figure 2.—Norms for Subject Evaluation 
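
The norm bands of Figure 2 amount to a simple table lookup, which the following Python sketch illustrates. The function name `classify` is hypothetical, the rule that a score falling exactly on a shared boundary goes to the lower band is an assumption (the figure does not say), and the label "normal" for the lowest time-difference band is likewise assumed.

```python
def classify(value, bands, top_label):
    """Return the label of the first band whose upper bound is not exceeded."""
    for upper, label in bands:
        if value <= upper:
            return label
    return top_label  # value lies above every listed upper bound

# Upper bounds and labels read from Figure 2.
NE_BANDS = [(14, "below average"), (15, "borderline"),
            (18, "average"), (21, "above average")]
TD_BANDS = [(10, "normal"),  # label assumed; not stated for this band
            (12, "borderline"), (15, "below normal")]
PHASE_BANDS = [(54, "normal"), (65, "borderline")]

ne_label = classify(16.5, NE_BANDS, "exceptional")
td_label = classify(13, TD_BANDS, "problem area")
phase_label = classify(70, PHASE_BANDS, "abnormal")
print(ne_label, td_label, phase_label)
```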


Figure 3, representing the scoring curve for neural
efficiency, and Figure 4 for time-delay are quantized only
with approximation, and the width of the curve indicates
the inherent uncertainty. But for the task for which the
equipment is presently recommended, namely, to
distinguish fast and inexpensively between normal or above
normal and marginal or learning-handicapped persons, a
greater sophistication would not serve the purpose. The
necessary age correction factor is shown in Figure 1.

The evaluation is restricted to a three-by-three matrix 
composed of “Brain Efficiency” with the symbolic 
values: Above Average, Average, Below Average; and 
“Time Difference” with Above Average, Average, Below 
Average. If either of the two parameters has a Below 
Average value, the subject needs special professional at- 
tention. It has been proven that as long as the brain ef- 
ficiency is Average or Above Average, the time difference 
Below Average can easily be remediated in the classroom 
by removing time stress from the student. This informa- 
tion is one of the advantages of the SCG. 
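
The decision rule described in this paragraph can be written out as a small function. This is a sketch under stated assumptions: the function name and the returned keys are hypothetical, and the rule encodes only what the paragraph says (any Below Average value calls for professional attention; a Below Average time difference alone is described as remediable in the classroom).

```python
LEVELS = ("below average", "average", "above average")


def evaluate_subject(brain_efficiency, time_difference):
    """Apply the three-by-three evaluation matrix described in the text.

    Both arguments must be one of LEVELS. The dictionary keys are
    hypothetical names chosen for this sketch.
    """
    for value in (brain_efficiency, time_difference):
        if value not in LEVELS:
            raise ValueError(f"unknown level: {value!r}")
    # Either parameter being Below Average flags the subject.
    needs_attention = "below average" in (brain_efficiency, time_difference)
    # Only the time difference is Below Average: remediable in the classroom.
    classroom_remediable = (
        brain_efficiency != "below average"
        and time_difference == "below average"
    )
    return {
        "needs_professional_attention": needs_attention,
        "classroom_remediable": classroom_remediable,
    }


if __name__ == "__main__":
    print(evaluate_subject("average", "below average"))
```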

Unfortunately, the differential diagnosis provided by 
the SCG does not “explain” symptoms, such as dyslexia or 
dyscalculia in a child, or eliminate the ambiguity in the 
term “minimal brain dysfunction” to the degree which 
would be helpful in remedial prognosis.

There are also some highly interesting fringe areas in our
field needing scientific proof and further intensive research.
Although there is no conclusive evidence, it is thought that
the applicability of the Synchrocephalograph, possibly
with multi-sensory feedback attachment, could be extended
during long training to ameliorate the time delay or
hemispheric asynchrony; thus, certain learning deficiencies
could be “cured.” Furthermore, the SCG could be applied


Figure 3.—Neural Efficiency Scoring Curve


Figure 4.—Time Delay Scoring Curve


drug abuse detection and rehabilitation monitoring,
[...]ection of epileptoid and schizoid tendencies
[...]nents, the SCG can be successfully used in the meantime
for the proven purposes indicated in this paper.
A FACTOR ANALYTIC COMPARISON OF


FACULTY AND STUDENTS’ PERCEPTIONS
OF STUDENTS


ERNEST T. PASCARELLA
Syracuse University


CONSIDERABLE RESEARCH has focused on the differences
and similarities between faculty and student perceptions
of such variables as: teaching effectiveness (7);
institutional decision-making and governance (4, 13);
educational values and goals (5, 10, 14); student characteristics
(3, 18); and institutional climate or functioning (1, 8, 17).
The general findings of this research indicate significant
group differences between the two constituencies on an
extensive array of instruments purporting to measure such
phenomena. Little has been done, however, to assess the
degree of congruence or dissonance between the perceptual
frameworks within which faculty and students view
significant educational variables.

Implicit in the writing of Becker (2), Martin (14), and
Tussman (19) is the suggestion that much of the cultural
separation between faculty and students is due to fundamental
differences in the dimensions along which the two
groups structure values, goals, and perspectives. This paper
reports the results of a factor analytic study of faculty
perceptions of a specific student group, senior Arts and
Sciences students, and those students’ perceptions of their
class peers. The purpose of the investigation was to determine
the degree of congruence in the factors or dimensions
along which both groups judge student characteristics.


Methodology 
Sample 


The institutions sampled in the study were two large
private universities located in Central New York State with
total undergraduate enrollments exceeding 10,000 students.
A 20% student sample consisted of 410 senior students
enrolled in the College of Arts and Sciences at both institutions.
At Institution A the specific Arts and Sciences population
from which the sample was drawn was 1250 seniors
(52.4% female, 47.6% male). The corresponding population
at Institution B was 800 seniors (33.1% female, 66.9%
male). A 30% faculty sample was drawn simultaneously and
randomly from Arts and Sciences faculty and the full-time
equivalent of graduate teaching assistants at both institutions.
The total faculty samples for Institutions A and B
were 168 and 138, respectively.


Instrument 

As a measure of their perceptions of senior students,
seven-point semantic differential scales (15) were employed
by both samples to rate the concept “Senior Students in
the College of Arts and Sciences” against 26 bipolar adjective
pairs. The pairs selected for use in the study were drawn
largely from two sources. The primary source was Osgood,
Suci, and Tannenbaum’s “thesaurus study” (15). The
“thesaurus study” empirically arrays the factorial composition
of an extensive number of bipolar pairs and related
scales. Scales drawn from this source theoretically tapped
“evaluation,” “potency,” “activity,” and “stability” constructs.
The second source was Pervin's Instrument for
the Transactional Analysis of Personality and Environment
(16). Scales from this source were: competitive/cooperative;
pragmatic/artistic; creative/uncreative; intellectual/non-intellectual;
tolerant/intolerant; and flexible/rigid. In addition,
a number of scales deemed particularly relevant to the
concept rated, but of unknown factorial composition, were
also included (e.g., intimate/remote; unstructured/structured;
supportive/frustrating; sensitive/indifferent). Such
a procedure has been suggested as appropriate by Osgood,
Suci, and Tannenbaum (15), provided the factor structure
is empirically determined and scales of known factorial
composition are also used.


Response 

The instrument was distributed to the total sample at 
the beginning of the spring 1973 semester. The size and 
percentage of useable responses for both the student and
faculty samples are shown in Table 1. Chi-square “goodness
of fit tests” to determine the significance of differences
noted between sample and population characteristics were
carried out on the sex and academic major variables for 
students, and sex, broad area of discipline, and academic 
rank for faculty. The only significant chi-squares (p< .05) 
were achieved on the variable sex for faculty. At both 
institutions women were slightly over-represented in the 
faculty sample. With this singular limitation, the samples 
at both institutions appeared to be representative of the 
populations from which they were drawn. 
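
A chi-square goodness-of-fit test of the kind described above compares observed respondent counts with the counts expected from known population proportions. The sketch below is illustrative only: the counts and proportions are hypothetical and are not the study's data, and the function name is an assumption. The critical value 3.841 is the standard chi-square cutoff for one degree of freedom at p = .05.

```python
def chi_square_goodness_of_fit(observed, population_proportions):
    """Return the chi-square statistic for observed counts against the
    counts expected under the given population proportions."""
    if abs(sum(population_proportions) - 1.0) > 1e-9:
        raise ValueError("population proportions must sum to 1")
    total = sum(observed)
    expected = [total * p for p in population_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Standard critical value for df = 1 at the .05 level.
CRITICAL_VALUE_DF1_P05 = 3.841

if __name__ == "__main__":
    # Hypothetical example: 100 respondents, 60 women and 40 men,
    # drawn from a population that is half women and half men.
    statistic = chi_square_goodness_of_fit([60, 40], [0.5, 0.5])
    print(round(statistic, 3), statistic > CRITICAL_VALUE_DF1_P05)
```

With two categories there is one degree of freedom, so a statistic above the cutoff would indicate that the respondents differ from the population on that variable at the .05 level.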


Statistical Analysis 


Analysis of the data began with an extraction of the
principal components of meaning underlying faculty and
students’ respective ratings of the concepts on the 26
bipolar pair scales. Following Kaiser’s (11) varimax criterion,
components with eigenvalues ≥ 1.0 were extracted
and subjected to orthogonal varimax rotation. (The
rotated components will hereafter be referred to as factors.)
Separate factor structures were computed for the
combined faculty sample and for the combined student
sample.
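
The retention rule described above keeps only those components whose eigenvalue reaches 1.0. A minimal sketch follows; the eigenvalues are hypothetical, the function names are assumptions, and the inclusive threshold reflects the reading "eigenvalues ≥ 1.0". For standardized scales, each eigenvalue divided by the number of scales gives that component's share of total variance.

```python
def retain_components(eigenvalues, threshold=1.0):
    """Keep the eigenvalues that meet the retention threshold."""
    return [ev for ev in eigenvalues if ev >= threshold]


def percent_total_variance(eigenvalue, n_variables):
    """Share of total variance carried by one component, in percent.
    Assumes standardized variables, so total variance equals n_variables."""
    return 100.0 * eigenvalue / n_variables


if __name__ == "__main__":
    # Hypothetical eigenvalues from a 12-scale analysis.
    eigenvalues = [6.2, 2.1, 1.3, 1.0, 0.8, 0.6]
    retained = retain_components(eigenvalues)
    print(retained)
    print(round(percent_total_variance(retained[0], 12), 2))
```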

“Program Relate” (20) was employed to obtain a general
indication of the similarity between faculty and student
factor structures. “Program Relate” permits the comparison
of factor structures from two independent sample
groups by holding one structure fixed and rotating the

Table 1.—Sample Size and Percentage of Student and Faculty Response

                          INSTITUTION A          INSTITUTION B       TOTAL BOTH INSTITUTIONS
                        SAMPLE RESPONSE   %    SAMPLE RESPONSE   %    SAMPLE RESPONSE   %

MALE                      119     46     39%     107     63     59%     226    109     48%
FEMALE                    131     75     57%      53     45     85%     184    120     65%
STUDENT TOTALS            250    121     48%     160    108     68%     410    229     56%

PROFESSORS                 41     21     51%      40     14     35%      81     35     43%
ASSOCIATE PROFESSORS       36     22     61%      28     15     54%      64     37     58%
ASSISTANT PROFESSORS       33     22     67%      23     14     61%      56     36     64%
LECTURER/INSTRUCTOR         7      6     86%       8      7     88%      15     13     87%
TEACHING ASSISTANTSᵃ       51     21     41%      39     15     38%      90     36     40%
FACULTY TOTALS            168     92     55%     138     65     47%     306    157     51%


a - full-time equivalent
Table 2.—Matrix of Scale Intercorrelations for Faculty Responses


DECIMAL POINTS 


20 21 22 23 24 25 
40 60 -26 -10 60 52 
51 65 -20 -08 67 57 
61 65 -32 .22 67 65 
55 64 -23 -02 64 63 
48 63 -18 -14 84 48 
55 67 -23 -21 82 52 
37 44 -27 -08 63 35 
60 58 -28 -19 68 65 
55 61 -34 -19 61 65 
53 64 -23  .14 58 61 
56 68 -25 -07 67 57 
59 44 -25 -20 51 49 
Table 4.—Varimax Rotated Factor Loadings Derived from Faculty Semantic Differential Ratings of Seniors (N = 157)


                                                    FACTORS

VARIABLE                                   I      II     III      IV       V      h²

strong/weak (6)                         .851   -.132     ...     ...     ...     ...
intellectual/non-intellectual (24)      .850   -.061     ...     ...     ...     ...
deep/shallow (5)                         ...     ...     ...    .004    .088    .748
stimulating/dull (11)                   .807   -.254   -.067    .130   -.019    .738
good/bad (1)                            .804   -.140    .152    .038   -.119    .705
candid/deceitful (3)                    .758   -.366   -.032    .154    .197     ...
progressive/regressive (2)              .748   -.206   -.038    .347    .088    .732
active/passive (8)                      .747   -.331   -.047    .159   -.147    .717
potent/impotent (4)                     .746   -.275   -.211    .256    .067    .747
excitable/calm (9)                      .740   -.310    .001    .045   -.264    .715
complex/simple (10)                     .728   -.277   -.015    .204   -.103    .660
systematic/disorganized (7)             .719   -.054   -.273   -.048   -.102    .610
creative/uncreative (21)                .701   -.234   -.201    .278   -.147    .642
intimate/remote (17)                    .680   -.213    .249     ...   -.019    .581
supportive/frustrating (26)             .645   -.193   -.041    .396    .157    .635
sensitive/indifferent (16)              .570   -.446   -.172    .306    .055    .651
accepting/critical (19)                 .035   -.762   -.017   -.042    .024    .585
tolerant/intolerant (20)                .436   -.668   -.245    .128   -.182    .746
flexible/rigid (18)                     .498   -.661   -.083    .128   -.100    .717
unstructured/structured (25)            .508   -.588   -.002    .313   -.187    .736
impulsive/restrained (14)               .268   -.065    .762    .142    .087    .684
reliable/unreliable (13)                .495   -.224   -.667    .091   -.049    .750
stable/unstable (12)                    .503   -.286   -.655    .070   -.109    .780
idealistic/practical (15)               .031   -.083    .090    .861   -.110    .769
competitive/cooperative (23)           -.011    .282    .130    .103    .798    .744
pragmatic/artistic (22)                -.226   -.153    .090   -.399    .734    .780

EIGENVALUES                            12.80    1.83    1.59    1.26    1.01
PERCENT TOTAL VARIANCE                 39.67   11.71    7.26    6.61    5.91
CUMULATIVE TOTAL VARIANCE              39.67   51.38   58.64   65.25   71.16
PERCENT COMMON VARIANCE                55.74   16.46   10.20    9.29    8.31
CUMULATIVE COMMON VARIANCE             55.74   72.20   82.40   91.69  100.00


NOTE: The number in parentheses after each scale indicates the number of that
scale in the matrix of intercorrelations.
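
In an orthogonal solution such as this one, the communality (h²) of a scale is the sum of its squared loadings across the retained factors. The short sketch below checks that relationship against two rows of Table 4 that are fully legible; the function name is an assumption.

```python
def communality(loadings):
    """Sum of squared loadings of one scale across orthogonal factors."""
    return sum(loading * loading for loading in loadings)


# Loadings on Factors I-V as printed in Table 4.
competitive_cooperative = [-0.011, 0.282, 0.130, 0.103, 0.798]
idealistic_practical = [0.031, -0.083, 0.090, 0.861, -0.110]

if __name__ == "__main__":
    print(round(communality(competitive_cooperative), 3))  # 0.744
    print(round(communality(idealistic_practical), 3))     # 0.769
```

Both results agree with the h² values printed for those scales, which is a useful check when individual loadings in a table are hard to read.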


second structure on it until maximal similarity is achieved
among the individual test vectors (test vectors in the
present study are the 26 bipolar pair scales). The degree of
rotation required to achieve maximal similarity is expressed
as a matrix of cosines which may be regarded as a matrix
of correlations between the two sets of factor variables or
factor vectors derived from the two analyses (12). For the
purposes of the study, two factors were considered “moderately
congruent” if one factor variable or vector
accounted for between 50 and 75 percent of the variance
in another, as estimated by squaring the value of the cosine
(i.e., cosine between .707 and .866). Factors with greater
than 75 percent estimated shared variance (i.e., cosine >
.866) were considered “highly congruent.” The reader is
cautioned that since no definitive standard is available in
the literature, the categories “moderately” and “highly
congruent” are nominal and therefore somewhat arbitrary.
Moreover, the procedure of squaring the cosine was
employed to obtain a general estimate of the shared variance
between factor variables. Since independent samples
are used, no assumption is made that such a procedure
necessarily provides the exact equivalent of the coefficient
of determination obtained from single-sample data. For
these reasons the use of “Program Relate” in the present
study should probably be regarded more as a descriptive
rather than a statistical method.
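
The congruence categories defined above depend only on the squared cosine. A minimal sketch follows; the function names and the label "not congruent" for pairs below the 50 percent level are assumptions, since the text names only the two upper categories.

```python
def shared_variance(cosine):
    """Estimated proportion of shared variance: the squared cosine."""
    return cosine ** 2


def congruence_category(cosine):
    """Classify a factor pair by the thresholds stated in the text."""
    shared = shared_variance(cosine)
    if shared > 0.75:       # |cosine| > .866
        return "highly congruent"
    if shared >= 0.50:      # |cosine| between .707 and .866
        return "moderately congruent"
    return "not congruent"  # label assumed for this sketch


if __name__ == "__main__":
    for cosine in (0.854, -0.880, 0.508):
        print(cosine, round(shared_variance(cosine), 2),
              congruence_category(cosine))
```

Because the cosine is squared, its sign does not affect the category; a strongly negative cosine indicates the same dimension with its poles reversed.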


Results 
Tables 2 and 3 show the respective intercorrelation
matrices on which components analysis was based. Principal
components analysis of faculty members’ semantic differential
ratings of senior students yielded five factors with
eigenvalues ≥ 1.0. The composition of these five factors
(i.e., the loadings of the bipolar pair scales on them) is
shown in Table 4. Arrayed in Table 5 is the factor structure
obtained from principal components analysis of senior
students’ semantic differential ratings of their class peers. As
the table indicates, six factors meeting Kaiser's (11) varimax
criterion were yielded by this analysis.
Table 6 displays the matrix of cosines derived from
comparison of faculty and senior student factor structures.
As shown in the table, a “moderate” or “high” degree of
structural similarity was indicated between each of the
faculty dimensions and five of the six individual student
dimensions. Only Factor II in the student structure
appeared to have little relationship with any of the factors
in the faculty structure, thus suggesting where the most
meaningful contrast between faculty and student dimensions
might be made. On this factor two of the three scales
Table 5.—Varimax Rotated Factor Loadings Derived from Senior Students’ Semantic Differential Ratings of
Their Class Peers (N = 229)


                                                        FACTORS

VARIABLE                                   I      II     III      IV       V      VI      h²

stimulating/dull (11)                   .753    .257    .116    .072    .102   -.100    .672
deep/shallow (5)                        .706    .348    .060    .170   -.141   -.001    .673
complex/simple (10)                     .683    .249    .028    .031    .113   -.044    .544
reliable/unreliable (13)                .661    .224   -.402    .126   -.007   -.109    .678
strong/weak (6)                         .629    .328    .006    .314   -.091   -.088    .618
candid/deceitful (3)                    .605    .047    .206    .165   -.431    .004    .624
intellectual/non-intellectual (24)      .544    .138    .180    .198   -.419   -.099    .572
potent/impotent (4)                     .530    .226    .024    .193    .345   -.295    .576
active/passive (8)                      .454    .152    .291    .264   -.161   -.372    .548
supportive/frustrating (26)             .446    .278    .033    .286   -.413   -.225    .579
flexible/rigid (18)                     .398    .296    .259    .299   -.114   -.222    .464
progressive/regressive (2)              .198    .847    .114    .118    .109   -.112    .807
good/bad (1)                            .203    .808    .017    .022    .116   -.065    .712
sensitive/indifferent (16)              .254    .731    .053    .160    .001   -.063    .632
intimate/remote (17)                    .248    .728   -.010    .015   -.233   -.089    .653
idealistic/practical (15)               .015    .009    .730    .239   -.177   -.242    .680
impulsive/restrained (14)               .179   -.001    .682   -.101   -.186    .322    .645
stable/unstable (12)                    .263   -.084   -.627    .215    .121   -.060    .534
excitable/calm (9)                      .254    .122    .595    .193    .073   -.167    .504
accepting/critical (19)                 .053    .112    .096    .849   -.086    .093    .762
tolerant/intolerant (20)                .292    .122    .075    .756   -.061    .119    .696
systematic/disorganized (7)             .395    .007   -.138    .495    .052    .166    .450
competitive/cooperative (23)            .043   -.022    .009   -.169    .847    .097    .757
unstructured/structured (25)            .145   -.073    .349   -.102   -.721    .132    .696
pragmatic/artistic (22)                -.089   -.145   -.002     ...     ...     ...     ...

PERCENT TOTAL VARIANCE                 18.49   12.26    8.94    8.77    8.34    6.62
CUMULATIVE TOTAL VARIANCE              18.49   30.75   39.69   48.46   56.80   63.42
PERCENT COMMON VARIANCE                29.15   19.33   14.10   13.88   13.10   10.44
CUMULATIVE COMMON VARIANCE             29.15   48.48   62.58   76.46   89.56  100.00


NOTE: The number in parentheses after each scale indicates the number of that scale
in the matrix of intercorrelations.


Table 6.—Matrix of Cosines Showing the Relationship between Faculty and Student Factor Structuresᵃ

                                         FACULTY FACTORS
STUDENT
FACTORS             I              II             III              IV               V

I             .854 (.73)     .024 (.00)    -.177 (.03)     -.139 (.02)     -.064 (.00)
II            .508 (.26)    -.081 (.01)     .166 (.03)      .154 (.02)      .241 (.06)
III           .062 (.00)    -.266 (.07)     .745 (.56)      .498 (.25)      .072 (.01)
IV           -.051 (.00)    -.880 (.77)    -.430 (.18)      .137 (.02)      .140 (.02)
V            -.053 (.00)     .322 (.10)    -.264 (.07)      .323 (.10)      .826 (.68)
VI           -.045 (.00)    -.215 (.05)     .362 (.13)     -.766 (.59)      .479 (.23)

a - FIGURES IN PARENTHESES INDICATE ESTIMATED PROPORTION OF SHARED
VARIANCE BETWEEN FACTORS
selected to tap the “evaluation” dimension (good/bad and
progressive/regressive) loaded strongly with scales appearing
to tap interpersonal “receptivity” or “sensitivity”
(sensitive/indifferent and intimate/remote). In the faculty
structure, however, the scales theoretically measuring the
“evaluation” construct (good/bad, progressive/regressive,
plus candid/deceitful) loaded highest on a dimension that
appeared to be a measure of intellectual strength, scholarliness,
and curiosity (e.g., strong/weak; intellectual/non-intellectual;
deep/shallow; stimulating/dull; active/passive;
complex/simple; systematic/disorganized; creative/uncreative).


Summary and Conclusions 

Comparison of the factor structures obtained from
faculty semantic differential ratings of seniors, and senior
students’ semantic differential ratings of their class peers
suggested a substantial degree of overall congruence. Despite
this general tendency toward factor structure similarity,
however, an interesting, and perhaps significant, contrast
between faculty and student structures was indicated in the
scales loading on the same factor with the “evaluation”
construct. In the faculty structure the “evaluative” dimension
generally clustered with a dimension that appeared to
measure students’ intellectual and scholarly orientations;
while in the student structure, “value” clustered with a
“receptivity/sensitivity” dimension, which had a low correlation
with scales measuring intellectual or scholarly traits.
In short, it might be suggested that faculty tend to
respond favorably toward senior students largely in terms
of their demonstrated intellectual capacities and orientations,
i.e., those traits associated with their formal role as
students and potential scholars. Seniors, on the other hand,
would appear to value their peers more in personalistic and
interpersonal than in formally academic or professional
terms.
Considering the evidence that the vast majority of
faculty-student contacts are limited to formalized classroom
transactions (6), it is not particularly surprising that
faculty members associate a favorable response toward students
with the presence of traits and orientations which are
most appropriate to this context. Moreover, as Jervis and
Congdon (10:466) point out:


Concern with the intellect and associated activities towers
high in the need structures and self images of university
professors. It is only natural that they would tend [...]
and to structure their worlds in these terms.

Similarly, given the significance of interpersonal interaction
in resolving the developmental tasks of maturation during
college (9), it would seemingly follow that students value
sensitivity and intimacy in their peers.

Beyond this, however, the fact that students in their
senior year associate value with interpersonal rather than
intellectual or cognitive traits perhaps suggests the extent
to which the ethos of the student peer culture successfully
resists faculty influence on the norms and values dominant
in student life outside the classroom. Skills, orientations,
and traits that facilitate one's successful functioning in the
formal role of “student” (characteristics which faculty
value and attempt to foster) may be relegated to a status
of considerably less worth in the peer culture milieu. By
not reinforcing these behaviors or orientations which
faculty value, the peer culture tends to significantly decrease
the likelihood of their assimilation by students even into
their senior year of college.
ABSTRACT


ployed a yoked-S design in which the first S chose to perform any of five tasks, while the
The results supported only the second hypothesis. A suggested explanation of the effects
was presented.


MUCH CURRENT literature in innovative education advocates
providing the learner greater freedom to learn what
he wants, when he wants, and how he wants. Such an
atmosphere of freedom is presumed to support the psychological
growth of the individual learner and to facilitate the
learning process per se.

While there has been a plethora of claims advocating the
beneficial effects of “freedom to learn,” there has been a
concomitant dearth of empirical research to support
these claims. However, in research related to this issue the
effect of the source of motivation on learning has been
examined. Sims (3) demonstrated that individual motivation
was vastly superior to group motivation in both reading
rates and substituting digits for letters. Symonds and
Chase (6) reported that in English usage tests intrinsic
motivation to perform the task caused no extra learning beyond
that of the practice effect of repetitive drill. However,
these investigators admitted that the method used to generate
“intrinsic motivation,” showing the importance of correct
English usage, was far from satisfactory. Kausler (2), in
a study comparing the effects of ego-involvement versus
task-orientation on the DuBois-Bunch Learning Test,
found that the ego-involved group performed significantly
better. Battle (1) compared inner-directed and other-directed
junior high school students on a difficult mathematics
problem and found that the inner-directed group
persisted significantly longer.

In the present study the relationships between the
amount of freedom given the learner in choosing a learning
task and his subsequent performance and persistence on
that task were investigated. It is reasonable to suspect that
allowing an individual to select a task freely might yield
results similar to those obtained with inner-directed,
intrinsically motivated subjects. Presumably after an individual
has freely chosen a task, he will be more committed to that
task than an individual who is forced to perform that task.
Thus, two hypotheses were tested. First, Ss who freely
choose a task will perform better at that task than those
who are forced to do it. Second, Ss who freely choose a
task will persist longer at that task than those who are
forced to do it.


Method


Subjects

The Ss were 56 undergraduate students enrolled in
introductory psychology classes at Ohio Wesleyan University
who volunteered to participate in the experiment in order
to satisfy a course requirement.


Five standard laboratory tasks were selected for use in
this experiment. Each task was selected so that S could
understand the general nature of the task based on a single
sentence description. Further, each task was made quite
difficult to eliminate any possible ceiling effects in measures of
performance and persistence. As evidence of the difficulty
of the tasks, only 1 of 56 Ss reached criterion before quitting
the experiment. The five tasks used were: (1) serial
learning: 50 common words were presented individually to
S for a brief interval using a memory drum. On the first
trial S simply observed the words as they were presented;
on successive trials he tried to recall each word in the correct
order before it appeared in the window of the memory
drum. Criterion was errorless recall on two consecutive
trials. The performance dependent variable was the number
of words recalled correctly per trial; the persistence dependent
variable was the total number of trials completed;
(2) maze learning: S was blindfolded and given a stylus. E
placed S's hand at the start of a complex maze, then at the
goal. S was told he was to reach the goal as quickly and
with as few errors as possible. When S reached the goal, E
placed S's hand back in the start box. The criterion for this
task was two consecutive trials through the maze without
error. The number of errors per trial was the performance
measure; total time spent on the task was the persistence
measure; (3) anagrams: S was given a list consisting of four
sets of three-five letters in scrambled order. The first three
sets could be rearranged to form common English words;
the letters in the fourth set could not be rearranged into
any known English word. S was told to unscramble the
words and write the correct word adjacent to the scrambled
version. The measure of performance was the average
amount of time to unscramble each meaningful word; the
measure of persistence was the amount of time S spent
attempting to solve the non-word anagram; (4) motor learning:
S was told to press a handle at exactly ten lbs. of pressure.
When S judged that he was pressing at this level, he
told E, who recorded the level from a gauge outside S's
line of sight. E gave S feedback after each press (trial) by
telling him the actual pressure obtained. Criterion in this
task was five consecutive trials at exactly ten lbs. of pressure.
The ratio of the number of correct presses to total presses
was the measure of performance; the total number of
trials was the measure of persistence; (5) probability learning:
S learned the order in which a series of lights was illuminated.
Because only two Ss chose this task, it was deleted
from statistical analyses and is not described in detail
here.


Procedure

S was given a brief description of the five tasks, rated
the tasks on 7-point scales that ranged from very unappealing
to very appealing, and then chose one of the tasks to
perform. The first S of a pair then performed the task he
had selected; these Ss constituted the free choice group.
The second S also performed this task, regardless of the
task he had chosen. If the second S chose the same task as
the first, they were both designated as free choice Ss, and
the next two Ss not choosing that task became the forced
choice Ss. S was then read the instructions appropriate to
the task. In addition, all Ss received the following instructions:
“Please work on this task as long as it is interesting
and worthwhile to you. If it becomes no longer interesting
or worthwhile, please tell me and we will conclude the
experiment. Are there any questions?” S then performed the
appropriate task. When he had completed the task, he again
rated the five tasks for appeal on a 7-point scale.


Results

The performance dependent variables for the five tasks
and the associated means and standard deviations for the
free and forced choice groups are presented in Table 1.
There were no differences between free and forced choice
Ss on the performance measure for three of the four tasks;
on the motor learning task there was a tendency for forced
choice Ss to respond correctly more frequently than free
choice Ss (see Table 1). S's score was then converted to a
z-score based on the distribution of scores for each task.²
A comparison of all free and forced choice Ss on these
z-scores indicated no differences between these groups,
t(51) < 1.
Table 1.—Performance of Free and Forced Choice Ss on Four Tasks

                                                      Free Ss      Forced Ss          N yoked
Task              Dependent Variable                  X̄     s      X̄     s      t     pairs     p

Serial learning   Words per trial
Maze learning     Errors per trial
Anagrams          Time per solvable word
Motor learning    Correct presses/total presses
JOURNAL OF EXPERIMENTAL EDUCATION


Table 2.—Persistence of Free and Forced Choice Ss on Four Tasks

                                                      Free Ss          Forced Ss             N yoked
Task              Dependent Variable                  X̄       s        X̄       s       t     pairs     p

Serial learning   Number of trials                   7.25    2.77     5.50    4.03    0.62     4     >.10
Maze learning     Total time (min.)                 20.14    7.58     9.71    4.49    2.29     5     <.05
Anagrams          Time on unsolvable word (min.)     2.57    2.05     1.86    1.36    0.77     8     >.10
Motor learning    Number of trials                  69.67   55.95    50.22   23.12    0.49     9     >.10


Table 3.—Ratings of Tasks before and after Performance

                  Free Ss                 Forced Ss
             X̄       s       N       X̄       s       N

Before      5.96    0.76     26     4.19    1.24     26
After       5.50    1.22     26     4.27    1.65     26


Comparable data for the persistence measures are pre- 
sented in Table 2. Only on the maze learning task did the 
free choice Ss persist longer than forced choice Ss. How- 
ever, an overall analysis using z-scores derived from the in- 
dividual tasks found that free choice Ss persisted longer 
than the forced choice Ss, t(51) = 1.88, p < .05.
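
The overall analyses pool Ss across tasks by first converting each S's score to a z-score within his own task. The sketch below illustrates that step with hypothetical scores. The function name is an assumption, as is the use of the sample standard deviation, since the paper does not say which form was used. The `reverse` flag flips the sign for measures on which a lower raw score is better, in the manner of the reversal described in the footnotes for the maze learning and anagram performance measures.

```python
from statistics import mean, stdev


def z_scores(raw_scores, reverse=False):
    """Standardize scores within one task.

    reverse=True flips the sign so that a higher z-score always means
    a better result (for measures such as errors or solution time).
    Uses the sample standard deviation, which is an assumption.
    """
    centre = mean(raw_scores)
    spread = stdev(raw_scores)
    sign = -1.0 if reverse else 1.0
    return [sign * (score - centre) / spread for score in raw_scores]


if __name__ == "__main__":
    # Hypothetical raw scores for two tasks, not the study's data.
    words_recalled = [4.0, 6.0, 5.0, 3.0]   # higher is better
    maze_errors = [12.0, 8.0, 15.0, 9.0]    # lower is better
    pooled = z_scores(words_recalled) + z_scores(maze_errors, reverse=True)
    print([round(z, 2) for z in pooled])
```

Once every task is on the same standardized scale, the pooled z-scores of free and forced choice Ss can be compared with a single t test.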

Ratings of the performed task, taken prior to and after
performance of the task, are presented in Table 3. Free
choice Ss preferred the task that was performed significantly
more than forced Ss, both prior to, t(51) = 6.80,
p < .001, and following, t(51) = 3.02, p < .01, performance
of the task. However, ratings of free choice Ss declined
following performance of the task, t(25) = 1.77,
p < .05, while those of the forced choice Ss did not
change significantly, t(25) < 1.


Discussion 


The results of this experiment provide some support 
for the hypothesized relationship between freedom of 
choice of a learning task and success at that task. Free 
choice Ss did persist longer at the selected task, even 
though they did not learn significantly more or faster, 
This conflicting result is particularly surprising in light of 
the fact that the measures of performance and persistence 
are not independent of each other. In general, longer per- 
sistence should result in increased performance, merely as 
an artifact of the interdependence of the two measures.

That there were no differences in the learning condition
does suggest a possible psychological mechanism for the
[...] facilitates performance. Conversely, [...] response
is not the correct response, then activation interferes
with learning.

For the particular tasks used in this experiment, it is
impossible to determine the nature of the response hierarchy
on any post hoc basis. If it is assumed that on some
tasks the dominant response is correct, while on other
tasks it is incorrect, the inconsistent performance differences
obtained would be explained. Thus it is suggested
that freedom of choice increases the level of activity of an
organism, and this results in increased task persistence. But
this increased drive level interacts with specific tasks and
may facilitate or inhibit performance.

Such an explanation also yields certain implications for
the utilization of freedom of choice in educational settings.
Such freedom will be most advantageous in curricula that
are designed to provide steady progress for the learner. In
particular, the freedom to participate may be of great
positive value for those forms of programmed instruction
that utilize small steps and strive for no errors on the part
of the learner (e.g., Skinner [4]).

Finally, these data may present a conservative esti- 
mate of the beneficial effects of freedom of choice. If a 
college student is required to participate in an experiment 


for a course requirement, as these Ss were, then the free 
choice group is not really that, but is “free” only relative 
to the forced Ss. An alternative procedure would be to 
give S the option of non-participation as a part of the ex- 
periment. Likewise, it is doubtful whether the tasks were 
very interesting or motivating for S to perform. More in- 
teresting and relevant tasks that provide an accurate assess- 
ment of both performance and persistence would be neces- 
sary in future research to determine the specific manner 
and extent to which freedom of choice can facilitate the
learning process.


FOOTNOTES 


1. This research was conducted while the author was at Ohio 
Wesleyan University. The author thanks Harry P. Bahrick for 
his advice and encouragement throughout this research and help- 
ful comments on an earlier draft of this manuscript. 
2. The signs of the z-scores for the maze learning and anagram
tasks were reversed so that a higher z-score represented higher 
performance. 
verbal and nonverbal evaluative communications to the students)
questionnaire was used to assess the students’ willingness to self-d


that teacher positive regard was related to stud illi 


tween verbal and nonverbal behaviors was not related to student willingn 


MUCH IS WRITTEN in teacher education materials 
about the role of the teacher as a facilitator of interpersonal 
relationships (2,10). She is expected to deal with students 
who are angry, upset, depressed, or unconcerned. In some 
schools she is expected to individualize instruction based 
upon the cognitive and affective characteristics of each 
child (2). As interest in “affective” education increases, the
classroom teacher is charged, in addition, with the
responsibility of enhancing the emotional development and
self-esteem of her students (12).

It is not within the scope of this study to analyze these 
teacher roles or to determine their efficacy for student 
learning or development. Rather, the focus is upon one small
aspect of teacher-student interactions: variables affecting
the willingness of the student to self-disclose to the teacher.

In her effort to facilitate interpersonal relationships,
gather information about affective characteristics, or
enhance self-esteem, a teacher may at times be hindered by
the unwillingness of a student to disclose personal
information. Yet the variables affecting such student disclosure
have not been systematically studied. Most research on
self-disclosure has examined the relationship between therapist
and patient. Research in psychotherapy indicates that such
therapist variables as empathy, positive regard, and
congruence lead to high levels of patient self-disclosure (10). In
situations other than therapy, Shapiro et al. (9) concluded
that persons are most willing to disclose to others whom
they perceive as most warm, congruent, and empathetic. The
relationship between these attributes and self-disclosure has
not been validated in educational settings, however.

This study systematically examined the relationship
between two categories of teacher behavior (congruence and
positive regard) and student willingness to self-disclose.
These variables were investigated because they have been
related to self-disclosure in other situations, and were defined
operationally in keeping with the special characteristics of
the classroom. One of these characteristics is that the teacher
most frequently communicates to individuals in the entire
class simultaneously. Teacher positive regard was defined,
therefore, as the communication of positive evaluative
statements to the class as a whole. Examples include “I’m very
proud of the way you are working,” “This is a very good
class.” Consistent with the work of Bugental (5), congruence
in this study was defined as agreement among the verbal,
vocal, and facial components of spoken communication. For
example, to be considered a congruent communication a
teacher’s positive verbal statements to the class would have
to be accompanied by nonverbal indicators of positiveness
such as pleasant, friendly, soft voice tone, smiling face,
open arms, and relaxed posture.

The investigations of Davidson and Lang (6) and Kajita
(7) of the effects of self-esteem upon interpersonal
perception suggest that students’ interpretations of teacher
evaluative statements might be related to student
self-perceptions. For this reason student self-esteem was
investigated as an independent variable.

Because sex differences are frequently found in
studies of elementary age children on outcome variables
such as achievement (1) and incidence of behavior
problems (3), sex of student was investigated as an independent
variable.


Method 


Hypotheses 

The independent variables were hypothesized to relate
to student self-disclosure as follows:

1. Willingness to self-disclose will be greater for high
self-esteem students.

2. Student willingness to self-disclose to the teacher will be
directly related to teacher positive regard (positive
evaluative communications). Willingness to self-disclose will
be greatest when both verbal and nonverbal communications
are positive, less when either verbal or nonverbal
are positive, and least when neither verbal nor nonverbal
communications are positive.

3. Student willingness to self-disclose will be greater when
the verbal and nonverbal behaviors of the teacher are
congruent.

4. Willingness to self-disclose will be related to student sex.


Subjects 
The subjects were 80 fourth-grade students attending a
predominantly middle-class public school in a large
Southwestern city. A total of 140 fourth-graders attended the
school. From this initial subject pool of 140 students, 21
children designated as “highly disruptive” by their
homeroom teachers were removed. This was considered necessary
to minimize teacher-student interactions other than those
to be manipulated by the experimental design. Subjects
were chosen from the remaining 119 students based upon
self-esteem scores and sex as described in the “Procedures”
section.

Procedures 


Six weeks prior to the experiment all fourth-grade students
in the school were given the Piers-Harris Self Concept Scale
(8) in their homeroom classes. From the group of 119
students described above, the 20 females and 20 males having
the highest self-esteem scores and the 20 females and 20
males having the lowest self-esteem scores were identified.

Five males and five females from the high self-esteem
subgroup and five males and five females from the low
self-esteem subgroup were then assigned randomly to each of
four treatment conditions. On the day of the experiment,
student absences in three of the treatment groups required
last minute substitutions. Consequently the final groups
had male/female ratios of 11/9, 10/10, 11/9, and 12/8. In
this final selection the mean Piers-Harris Self Concept score
of the 40 high self-esteem subjects was found to differ
significantly from the mean for the low self-esteem group
(t=18.1, p < .001).
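
The selection and assignment procedure just described can be sketched in code. The following is a minimal illustration, not the authors' actual procedure: the function names, the record layout, and the idea of passing in a seeded random generator are assumptions introduced here. Only the group sizes (the 20 highest and 20 lowest scorers of each sex, with five per stratum going to each of four conditions) come from the text.

```python
import random
from collections import Counter


def select_extremes(students, per_group=20):
    """Pick the lowest- and highest-scoring students within each sex.

    `students` is a list of dicts with hypothetical keys
    "id", "sex" ("F" or "M"), and "score" (self-esteem score).
    """
    strata = {}
    for sex in ("F", "M"):
        ranked = sorted(
            (s for s in students if s["sex"] == sex),
            key=lambda s: s["score"],
        )
        strata[(sex, "low")] = ranked[:per_group]
        strata[(sex, "high")] = ranked[-per_group:]
    return strata


def assign_conditions(strata, n_conditions=4, rng=None):
    """Randomly split every stratum evenly across the conditions."""
    rng = rng or random.Random()
    conditions = {c: [] for c in range(1, n_conditions + 1)}
    for members in strata.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        per_cell = len(shuffled) // n_conditions
        for i, c in enumerate(conditions):
            conditions[c].extend(shuffled[i * per_cell:(i + 1) * per_cell])
    return conditions


# Usage with a made-up pool of 119 students (scores are simulated).
rng = random.Random(0)
pool = [
    {"id": i, "sex": "F" if i % 2 == 0 else "M", "score": rng.randint(20, 80)}
    for i in range(119)
]
groups = assign_conditions(select_extremes(pool), rng=rng)
```

Stratifying before shuffling guarantees that every condition starts with the same sex and self-esteem composition, which is why the unequal male/female ratios reported above arose only from the last-minute substitutions.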

One female, age 24, served as the teacher in all four
conditions. She had received 28 hours of instruction, practice,
and video-tape feedback in the presentation of positive and
negative evaluation via the verbal and nonverbal channels. At
no time during the study was she aware of the hypotheses
being investigated.

For each experimental condition the 20 subjects were
removed from their regular classrooms to participate in an
experimental micro-lesson. In a nearby vacant classroom
these students were taught a vocabulary lesson in which the
teacher pronounced, wrote on the board, spelled, and used
in two sentences eight different vocabulary words
appropriate to the subjects’ grade level. After the presentation of
each word, the students were instructed to write as many
“interesting and original sentences” as they could in two
minutes using the vocabulary word. During this time the
teacher walked around the room examining the subjects’
work, but not communicating with them in any way.

Immediately after each two-minute work session and be- 
fore the presentation of the next word, the teacher rendered 
a two-sentence evaluation of the students. The varying of 
these evaluations across conditions was the experimental 
manipulation, and all other teacher behavior was held con- 
stant. 


In Condition I, the positive verbal, positive nonverbal
condition, the teacher's positive verbal evaluations of the
class (for example, “You’re writing very interesting
sentences. This must be a smart class.”) were accompanied by
positive nonverbal communications such as a pleasant
voice tone, smiling face, open arms, and relaxed posture.
Condition II presented a positive verbal, negative nonverbal
teacher. Positive evaluative statements were accompanied by
angry voice tone, frowning face, closed arms, and rigid
posture. In Condition III the teacher gave negative verbal
evaluations (for example, “You’re not writing very interesting
sentences. This must not be a smart class.”) accompanied
by the positive nonverbal elements of pleasant voice tone, etc.
Condition IV contained only negative verbal statements and
negative nonverbal communications. No student questions
or comments occurred during any of the conditions. All
conditions were videotaped and later rated by four judges
independently. The judges attained 100% agreement in
assigning each tape to the correct condition.
In every condition, following the last evaluation, the
teacher left the room and the experimenter administered to
the students the instrument designed to assess willingness to
self-disclose. This 16-item inventory was designed by one of
the authors and is available from him. Each item is of the
form “YES NO 1. If Miss ______ were my teacher, I would
want to talk with her about...” The endings varied from
more superficial subjects (my favorite games) to more
intimate subjects (what I don't like about myself). A subject’s
self-disclosure score was simply the total number of items
to which he responded “yes.”


Results 


Hypotheses 1 and 2 


An analysis of variance performed on the self-disclosure
scores revealed no effect for student self-esteem. As
indicated in Table 1, high and low self-esteem students expressed
equal willingness to self-disclose to the teacher.

A main effect for teacher positive regard was found (see
Table 1). Table 2 summarizes the findings of t-test
comparisons of the means of the four conditions. As predicted by
Hypothesis 2, the students in the most negative condition
(IV) were the least willing to self-disclose. When the mean of
this most negative condition (IV) was compared with the
means of each of the other conditions, the difference
between the means was found to be significant beyond the
.05 level for each comparison. However, Table 2 also
indicates that the means of the four conditions did not become
increasingly smaller as predicted by Hypothesis 2. The t-test
comparisons revealed no significant differences among the
means of Conditions I, II, and III.


Table 1. Analysis of Variance Comparison of Student Self-disclosure
Scores of High and Low Self-esteem Subjects by Four Conditions of
Teacher Positive Regard (N=80)

Source                        Mean Square    df    F-ratio      p

Between                          82.107       7
  Self-Esteem                      .050       1     .0010     .9734
  Teacher Positiveness          144.183       3    2.8982     .0400
  Self-Esteem x Teacher
    Positive Regard              47.383       3     .952      .4217
Within                           49.750      72


Table 2. Means, Standard Deviations, and t-Ratios for the Four
Treatment Conditions on the Self-disclosure Instrument

                                 Mean                t-ratios
Condition                        Score     SD      II     III     IV

I   Verbal + Nonverbal +         13.3     5.40    .138   .981   2.25*
II  Verbal + Nonverbal -         13.0     8.12           .939   1.76*
III Verbal - Nonverbal +         15.3     7.35                  2.849*
IV  Verbal - Nonverbal -          8.9     6.85

*p < .05


Hypothesis 3


In order to determine whether congruence between
teacher verbal and nonverbal behavior was related to student
willingness to self-disclose, the mean of Conditions I and IV
combined (congruent conditions) was compared with the
means of Conditions II and III (incongruent conditions)
combined using a t-test. This difference between the means
approached significance (t=3.61, p < .06), but was in the
direction opposite that predicted by Hypothesis 3. There
appeared to be a tendency for subjects in the incongruent
conditions to express greater willingness to self-disclose.
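
The t-test comparisons used in this section can be sketched from summary statistics alone. The block below is a hedged illustration rather than the authors' computation: the function name `t_ratio` and the equal-group-size pooled formula are assumptions made here, on the basis that each treatment group contained 20 subjects.

```python
import math


def t_ratio(mean_a, sd_a, mean_b, sd_b, n):
    """Independent-samples t for two groups of equal size n.

    With equal group sizes the pooled standard error reduces to
    sqrt((sd_a**2 + sd_b**2) / n); degrees of freedom are 2n - 2.
    """
    standard_error = math.sqrt((sd_a ** 2 + sd_b ** 2) / n)
    return (mean_a - mean_b) / standard_error


# Condition III versus Condition II, using the reported means and SDs.
t_iii_vs_ii = t_ratio(15.3, 7.35, 13.0, 8.12, n=20)  # about 0.94
```

Applied to the reported means and standard deviations, this formula gives a value close to the tabled ratio for Conditions II and III, which suggests (without confirming) that the published comparisons were ordinary two-group t-tests on 38 degrees of freedom.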


Hypothesis 4 

An analysis of variance of self-disclosure scores by sex
and teacher positive regard revealed a main effect for sex
(F=5.60, df=1, p < .02) and a significant interaction between
sex and teacher positive regard (F=3.21, df=3, p < .03). As
depicted in Figure 1, male and female students indicated
relatively equal willingness to self-disclose in Conditions
I, II, and III, while the scores of the male students in the
most negative condition (IV) were substantially depressed.

Figure 1. Comparison [...] students’ willingness to self-disclose.


Discussion 


Results indicate that congruent behavior of an unfamiliar
teacher is not related to student willingness to self-disclose
to her after an initial encounter. This finding was contrary to
the experimental prediction and to previous research. The
brevity of the micro-lesson prohibits rejecting the hypothesis
that continued exposure to a teacher whose verbal and
nonverbal behaviors are incongruent is related to student
willingness to self-disclose. The very limited exposure of students to
the teacher may have operated to establish a ceiling for
self-disclosure scores, thus restricting the variability of the data.
The general applicability of this finding should be regarded
with some suspicion until further research provides some
corroboration.

The interaction effect which was found for student sex
and teacher positive regard showed that when the verbal and
nonverbal evaluative behavior of the teacher was clearly
negative, male students expressed an unwillingness to
self-disclose. They responded “yes” to an average of only 5 out of
16 items on the self-disclosure scale. Males in the other
three conditions and females in all four conditions indicated
relative willingness to self-disclose, regardless of teacher
evaluative behavior.

Several factors may account for these findings. It is
possible that female students have learned by this age to
conform more readily than males to the implicit expectations
of the school. Thus for females, norms which dictate
cooperation and manifestation of trust may outweigh the
effects of even the most negative teacher behavior upon their
expressed willingness to self-disclose.

Another possibility is that this finding results from a 
greater willingness among students to disclose to a very 
negative teacher when that teacher is of the same sex. The 
test of this hypothesis would require another study in which
male and female students were exposed to teachers of both 
sexes. Such an experiment would allow for the investigation 
of questions which cannot be addressed within the limita- 
tions of the present experimental design—those concerning 
the effects of sex of teacher. 

Still another explanation is suggested by research on the 
effects of student sex on teacher punitiveness (4). It has been
a clear finding that males are warned, criticized, and pun- 
ished by the teacher more frequently than are females. This 
suggests that males may come to see unambiguous negative 
evaluation as a prelude to punishment. They thus would be 
more likely to avoid contact with teachers who might in- 
flict punishment upon them. 

Very little systematic investigation has been conducted 
on the role of nonverbal behavior in the classroom (14). The 
microteaching paradigm used in this study was chosen to 
provide an optimal combination of control and
correspondence to classroom experience. Thus, it was not assumed
that the 20-minute sample of behavior surveyed represented
a typical cross section of classroom activity, but rather the
closest approximation which would allow for experimental
manipulation of the variables of interest. Clearly there are
questions about whether one can generalize from these find- 
ings. Ultimately, any hypotheses developed by this form of 
research must be tested through an examination of teacher- 
student interactions in regular classroom settings. 
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BEHAVIORAL COMPONENTS OF 


SCHOOL READINESS 
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ABSTRACT 


[...] behavioral changes [...] preschool
children (CA = 3.9 to 4.9). Change during a three-month [...]
preschool program in self-concept, delay of gratification,
self-control, and risk taking was analyzed using multiple
linear regression. For the girls, growth in self-concept and
delay of gratification was significantly related to growth in
school readiness, while for the boys, change in self[...]
school readiness.


READINESS AS defined by Ausubel (1) refers to “the
adequacy of existing capacity in relation to the demands of
a given learning task” (p. 246). Maturation according to
Ausubel has a different and much more restricted meaning.
It represents development that takes place independently
of prior learning. A child's level of school readiness depends
not on maturation alone but on varying proportions of
maturation and learning (1). Thus a major component of
the child's lack of readiness for school is due to gaps or
inadequacies in his prior learning.


Difficulty occurs when readiness is confused with the
concept of maturation. In the present study a measure of
the child's intellectual maturation at the beginning of an
enrichment program was statistically controlled in order to
separate the child’s school readiness due to prior learning
from his school readiness due to intellectual maturation
during the program. Specifically, the present study is
concerned with the personality characteristics of the migrant
child which are related to school readiness due to
experience.
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In contrast with previous research, the present study is 
concerned with achievement due to behavioral characteris- 
tics rather than the child’s intellectual development. 

Thus, determination of the nonintellectual characteris- 
tics of migrant children that are involved in development of 
the child’s readiness for academic material was the purpose 
of this study. Removing the role the child’s intellectual 
development (cognitive skills) plays on his readiness for 
school eliminates any confounding relationship that may 
exist between personality and intelligence. Mischel (10), 
for example, states that most personality measures relate 
more highly to intellectual measures than to different meas- 
ures of the same trait. 

It was hypothesized that development in the migrant
child's self-concept and in his ability to delay gratification,
exercise self-control, and take moderate risks would be
associated with growth in school readiness. Support for this
hypothesis is provided by research using not only preschool
children but also older children and adults. The
characteristics of children who display more achievement
behavior were studied through a longitudinal investigation
(12) based on standardized tests and ratings of children’s
behavior in nursery and elementary school as well as in the
home. The authors determined that girls whose IQs
increased during the preschool years were able to delay
gratification of their desires until some future time. Mischel (9)
also found a significant relationship between preference for
immediate smaller, or delayed larger, reinforcement in
choice situations, and “n” Achievement (responses to
pictures scored for achievement motive). Mischel’s sample
consisted of 112 Trinidadian children between the ages of
eleven and fourteen years. Haggard (5) found that high
achievers had better self-control than equally gifted children
who were not achieving at such a high level.

Support for a relationship between self-concept and
achievement was provided by Crandall, Katkovsky and
Preston (4), who assessed the amount of time elementary
school-age children chose to spend in intellectual activities
during free play time while at a summer camp. Boys who
predicted their own success in intellectual activities spent
more time engaged in intellectual activities, while no such
relationship was obtained for the girls in the study.
McClelland (8) examined the relationship between “n”
Achievement (response to pictures scored for achievement
motive) and risk taking in 26 children in kindergarten and
32 children in third grade. In both groups of subjects,
individuals with high “n” Achievement tended to take
moderate risks, while students with low “n” Achievement
preferred either safe or speculative enterprises.

In contrast with the reviewed studies which examined
behaviors independently of one another, the present study
examines the characteristics acting together in predicting
achievement. This makes it possible in the present study to
determine whether the characteristics are independent of
one another. Due to sex differences found in the reviewed
studies, a separate analysis was conducted for each sex.


Method 


The behavioral measures used in the study were
administered three months apart by four black female
psychometrists. The subjects were 132 four-year-old (three years,
nine months to four years, nine months) black migrant
children (71 girls and 61 boys) participating in a program of
compensatory education being conducted for preschool
children of migrant workers in south Florida. Total
enrollment in the program was 300 students, but pre- and post-test
[...] attendance of less than three months. It was thought that
these children are representative of the rural culturally
deprived. Possibly, these children are even more deprived than
most in the South due to the nature of their migrant lives.

The pre-test measures were administered approximately 
two weeks after the child entered the program, with the 
post-tests administered three months later. All of the in- 
struments administered to measure the variables under 
study are available from the Educational Testing Service of 
Princeton, New Jersey. Several of the measures are
experimental devices lacking in validity and reliability data. A
brief description of each of the measures follows the
behavior they measure.


Cognition 

The cognition measure included the sum of correct re- 
sponses on the following three measures. These measures 
are thought to tap the major components of cognitive de- 
velopment in preschool children. 

ETS Matched Pictures Comprehension Task measures
listening and recognition of word and sentence properties.
The measure was developed to meet the need for a series
of syntactically structured tasks which would require
minimal responses from the child (i.e., pointing). The tasks
consist of a “Matched Picture” presentation of 20 cards
containing pairs of stimulus pictures. Both pictures contain
similar elements, but they depict different relationships.
The examiner asks the child to point to the similar elements.

ETS Story Sequence Task, Part II measures speaking,
retelling, comprehension, and creative speech. The test
materials consist of two sets (three and four cards each) of
cartoon-like sequences using animals as characters. The
examiner tells the subject to listen carefully to the story
because the subject is to repeat the same story. The subject’s
version of the story is recorded on tape for later scoring and
interpretation.

Matching Familiar Figures measures the child’s reflection- 
impulsivity tendencies. The subject is shown a set of four 
pictures, then a single standard. His task is to identify the 
one comparison figure among the four that is identical to 
the standard by pointing to the correct figure. 


Delay of Gratification 
The Mischel Technique measures ability to delay
gratification. The subject is shown two rewards (candy) and is
told that he can have the smaller one now or the larger one
at some later period (specified by the Examiner). He is
asked whether he wishes the smaller or the larger of the
two items.


JOURNAL OF EXPERIMENTAL EDUCATION


Risk Taking 
The first task in the Risk Taking measure consists of
showing the child two bags. The child looks into the first
bag and sees a toy (car) in it. He is told that the other bag
may be empty or may have five toys in it. The child is then
asked if he would rather have the car or the other bag. If
the child selects the bag, the game is over. If the child
selects the car, then he is shown the contents of the bag and
is asked to choose another bag. The same choice situation
is again presented to the child. If when first asked he selects
the bag that may have five toys in it, he receives two points;
however, if he selects the bag the second time, he receives
one point.


Self-concept

Brown IDS Self-concept Referents Test measures the
child's perception of self. The procedure involves taking a
photograph of each subject to use in asking the subject
questions about his picture. Each positive response receives
a score of one; each negative response receives a score of
zero. The child is asked to respond with a “yes” or “no” as
to whether the child in the picture is “happy,” “clean,”
“ugly,” “talks a lot,” “good,” “scared,” and so forth.


Self-control

Motor Inhibition measures the child’s self-control. The
child performs two motor acts: drawing a line between two
points and walking a distance of six feet. He practices each
act and then is timed as he performs it as slowly as he can.
The child’s score is the time it takes the child to complete
the drawing. The Motor Inhibition Ability Test was
introduced by Maccoby, Dowley, Hagen, and Degerman (7).


Dependent Variable

Cooperative Preschool Inventory (CPI) measures general
knowledge, listening for word meaning and comprehension,
writing (form copying), speaking, and quantitative skills.
The CPI was designed as an assessment procedure for use
in individual testing of children age three to six years (3).
The CPI consists of 85 items which were selected on the
basis of a principal components factor analysis. Williams
and Stewart (13) reported a reliability of .93 (coefficient
alpha) for a sample of 445 children attending a summer Head
Start program. The author obtained a coefficient alpha
reliability of .88 for the CPI administered to 191 migrant
preschool children.


Statistical Procedure

Multiple linear regression analysis (2) was used to
examine the relationship between growth in the four traits
and growth in school readiness not accounted for by the
intellectual development (cognition) upon entering the
compensatory program. The analysis conducted examined the
relationship between the growth in each trait and the
growth in school readiness with all traits acting together in
predicting achievement.

The pre-test measures on the characteristics being studied
were used as covariates to the post-test measures instead of
using gain scores. This approach provided a more reliable
measure than the use of gain scores. The measure of
cognitive development (combination of three ETS tests) was
taken with the other pre-tests.


Results 


Regression analysis allowed the investigator to estimate
the proportion of criterion variance that can be accounted
for by the complete system of the five traits (including
cognition) and by each individual trait. An examination of the
unique contributions of the five variables to the prediction
of achievement was made by comparing the complete
system R² value to that of the prediction system with one of
the variables omitted. The difference between the two values
is an estimate of the independent contribution of the omitted
variable and may be evaluated by means of the F statistic.
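
The comparison of a complete and a reduced prediction system can be written out as a short calculation. This is a sketch of the standard F test for an increment in R², offered under stated assumptions: the function name, its parameters, and the numbers in the usage lines are illustrative and are not taken from the study.

```python
def unique_contribution(r2_full, r2_reduced, n, k_full, k_dropped=1):
    """F test for the variance uniquely explained by omitted predictors.

    r2_full    -- R squared of the complete prediction system
    r2_reduced -- R squared with k_dropped predictors omitted
    n          -- number of subjects
    k_full     -- number of predictors in the complete system
    Returns (increment in R squared, F, (df numerator, df denominator)).
    """
    increment = r2_full - r2_reduced
    df_numerator = k_dropped
    df_denominator = n - k_full - 1
    f_value = (increment / df_numerator) / ((1.0 - r2_full) / df_denominator)
    return increment, f_value, (df_numerator, df_denominator)


# Hypothetical values chosen only to show the arithmetic.
increment, f_value, dfs = unique_contribution(0.60, 0.55, n=50, k_full=5)
```

The increment is the proportion of criterion variance attributable to the omitted variable alone, and the F value is referred to a table with the returned degrees of freedom.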

The variables used in the regression analysis were:
(a) cognition (variable 1); (b) delay of gratification
(variables 2, pre-test and 3, post-test); (c) risk taking (variables
4, pre-test and 5, post-test); (d) self-concept (variables 6,
pre-test and 7, post-test); (e) self-control (variables 8,
pre-test and 9, post-test); and (f) the criterion variable, the
post-test CPI score (Y) with the pre-test CPI measure used
as a covariate (variable 10).

On examination of Table 1, it is apparent that only
self-control for the boys accounted for a significant proportion
of the achievement growth variance (6.5 percent). This
finding indicates that a positive relationship between growth
in achievement and growth in self-control exists for
preschool migrant boys attending a compensatory program.
Results for the girls attending the program indicate that
change in delay of gratification and self-concept accounted
for a significant proportion (4.7 and 23.2 percent,
respectively) of achievement growth variance.

Table 2 presents the pre- and post-test means and
standard deviations for the measures used in the study.
Examination of the means revealed that the mean of the girls’
delay of gratification post-test measure was lower than the
mean of the pre-test measure. This indicated that the girls
who were less likely to delay gratification on the post-test
measure than on the pre-test measure were the ones who
gained the most in achievement.


Discussion 


The finding with the most statistical and potentially
practical importance was the relationship between growth
in self-concept [...] for girls. Sears (11) found that boys were
[...]

Table 1. Regression Equations for Full Prediction System and for Full
Prediction System Minus Post-test Measure for Each of the Four
Variables Believed To Be Associated with School Readiness

Boys (N=61)

10-Variable Prediction System (R²=.649)
Y = .001x? + .675x10 - .277x6 - .048x? - .016x? - .039x?
    + .005x? - .009x4 + .244x9 + .000x?

9-Variable Prediction System with Risk Taking Omitted (R²=.649)
Y = .007x? + .675x10 - .277x6 - .048x3 - .016x? - .039x?
    + .005x? - .009x4 + .244x?

9-Variable Prediction System with Self-Control Omitted (R²=.604)
Y = .014x? + .714x10 - .230x6 - .075x? + .068x? - .039x?
    + .000x? - .030x4 - .007x5

9-Variable Prediction System with Delay of Gratification Omitted (R²=.649)
Y = .006x? + .677x10 - .273x6 - .048x? - .017x? - .040x?
    + .006x? [...] + .244x? + .00x?

9-Variable Prediction System with Self-Concept Omitted (R²=.649)
Y = .007x? + .675x10 - .277x6 - .047x? - .016x? - .039x9
    - .009x4 + .244x? + .000x?

Girls (N=72)

10-Variable Prediction System (R²=.699)
Y = .179x? + .506x10 + .051x? - .082x? - .050x? - .113x?
    + .550x? - .236x4 + .154x? - .050x?

9-Variable Prediction System with Risk Taking Omitted (R²=.696)
Y = .174x? + .509x10 + .038x? - .085x? - .045x? - .116x?
    + .554x? - .232x? + [...]

9-Variable Prediction System with Self-Control Omitted (R²=.681)
Y = .188x? + .514x10 + .010x? - .094x? + .035x? + .134x9
    + [...] - .230x? - .046x?

9-Variable Prediction System with Delay of Gratification Omitted (R²=.651)
Y = .141x? + .535x10 + .045x? - .006x? [...] .303x?
    [...] .533x? - .144x? - .041x?

9-Variable Prediction System with Self-Concept Omitted (R²=.469)
Y = [...] + .360x10 + .202x? + .025x? [...] .000x? [...]
    [...] .208x? + .151x? - .100x? [...] - .090x? [...]


Table 2. Boys (N=61) and Girls (N=72) Pre- and Post-test Means and
Standard Deviations

                                    Boys                          Girls
Trait                          X            SD              X            SD
                           Pre   Post    Pre   Post     Pre   Post    Pre   Post

Self-Concept               9.36  13.86   2.53   2.64    9.49  13.61   2.73   2.68
Delay of Gratification      .30    .44    .24    .22     .57    .44    .30    .33
Risk Taking                1.58   1.64    .61    .68    1.82   1.81    .63    .64
Self Control              28.32  44.54   7.88   8.23   26.59  34.03   8.06   7.45
Cooperative Preschool
  Inventory (CPI)         29.62  42.40   9.42  10.01   35.70  48.83   8.67   9.08

Cognitive Development     19.82   7.48


evaluations of their achievements, would not be as likely to 
increase in the accuracy of their self-concept during the 
program as would the girls.

The negative relationship between delay of gratification
and gain in preschool readiness supports Maccoby's (6)
theory of a curvilinear relationship between activity level
and achievement. Extreme inactivity or activity, according
to Maccoby, would result in too few or too fleeting contacts
with experience, respectively, whereas toward some midpoint
there would be sufficient interaction with the environment,
at an appropriate "rate," for learning to take place.
Thus there is an optimum point on the activity dimension,
and this is identical for both boys and girls. Partial support
for this position can be found in Table 2, where after three
months in the compensatory program, the girls' and boys'
post-test delay of gratification means are identical.
Demonstrating a curvilinear relationship as predicted by
Maccoby (6) was not possible in the present study because
the Mischel measure of delay of gratification is a single
binary response. Although the negative relationship between
delay of gratification and school readiness for girls is
interesting, a replication of these findings will be needed
before any major change in the preschool program should be
considered. The other findings have been supported by
previous research and indicate a need to examine the nursery
school curriculum to determine how best to develop these
characteristics. Development of these affective
characteristics of the preschool child should complement
development of the child's cognitive skills. It would be
unrealistic, for example, to have a unit on self-concept
development. Instead, every activity the child engages in
during the program could be an opportunity for the child to
gain self-esteem. This will require the teacher and aides to
examine their behavior with the children to determine what
changes are necessary to


for an art project, for example, the teacher should consider
along with her major objectives for the project how to use
this opportunity to encourage and reward the development
of impulse control in the boys and at the same time
promote more "outgoingness" in the girls.

These suggestions are tentative, yet they do deserve
application and study. It is suggested that differential
treatment on the basis of sex in the area of impulse control
be applied to nursery school children in a controlled
setting. If the preschool children in the experimental class
show more gain in achievement than the control class, then
substantial support will be provided for the speculations
presented. Only then can we infer a causal relationship
between impulse control and achievement.


REFERENCES


1. "Viewpoint from Related Disciplines: Human Growth and
   Development," Teachers College Record, 60, 1959.
2. Bottenberg, R. A.; and Ward, J. E., "Applied Multiple Linear
   Regression," 6570th Personnel Research Laboratory, Aerospace
   Medical Division, Air Force Systems Command, Lackland Air
   Force Base, Texas, 1963.
3. Caldwell, B. M., The Preschool Inventory Technical Report,
   Educational Testing Service, Princeton, N.J., 1967.
4. Crandall, V.; Katkovsky, W.; and Preston, A., "Motivation
   and Ability Determinants of Young Children's Intellectual
   Achievement Development," Child Development, 33:643-661,
   1962.
5. Haggard, E. A., "Socialization, Personality and Academic
   Achievement in Gifted Children," School Review, Winter
   Issue:388-414, 1957.
6. Maccoby, E. E., The Development of Sex Differences, Tavistock,
   London, 1967.
7. Maccoby, E. E.; Dowley, E. M.; Hagen, J. W.; and Degerman, R.,
   "Activity Level and Intellectual Functioning in Normal
   Preschool Children," Child Development, 36:761-770, 1965.
8. McClelland, D., "Risk Taking in Children with High and Low
   Need for Achievement," in J. Atkinson (ed.), Motives in
   Fantasy, Action, and Society, D. Van Nostrand Co., Princeton,
   N.J., 1958, 306-321.
9. Mischel, W., "Father Absence and Delay of Gratification:
   Cross-cultural Comparison," Journal of Abnormal and Social
   Psychology, 63:116-124, 1961.
10. Mischel, W., Personality and Assessment, Wiley, New York, 1968.
11. Sears, P. S., "Correlates of Need Achievement and Need
    Affiliation and Classroom Management, Self-concept,
    Achievement and Creativity," unpublished manuscript,
    Laboratory of Human Development, Stanford University,
    Stanford, Calif., 1962.
12. Sontag, L. W.; Baker, C. T.; and Nelson, V., "Mental Growth
    and Personality Development: A Longitudinal Study," Child
    Development Monograph, No. 2, 1958.
13. Williams, R. H.; and Stewart, E. E., "Some Characteristics of
    Children in the Head Start Program," Project Head Start,
    Final Report, Section One, Educational Testing Service,
    Princeton, N.J., 1966.
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A MULTIPLE-CHOICE EXAMINATION 
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ABSTRACT 


The separation level $S(Z_1, Z_2)$ between two grades $Z_1$ and $Z_2$
on a multiple-choice examination, as introduced by Krutchkoff (1),
was based on a normal approximation to the distribution of W, the
number of answers (out of N) known by the student. Thus, the
conditional probability function P(W/Z) was derived by Krutchkoff
in a series form involving the standard normal distribution, and
the results showed higher probabilities at both tails of the
distribution. In the present paper, an accurate closed form for
P(W/Z) is derived which permits calculations to be performed on a
programmable desk calculator. This closed form is based on an
exact distribution of W as a binomial random variable with
parameters N and $p_1$, the proportion of subject matter known by
the student. Hence, we recalculate values of P(W/Z) and
$S(Z_1, Z_2)$ for Krutchkoff's example.


THE SEPARATION LEVEL of grades on a multiple-choice
examination as a quantitative probabilistic criterion for
correct classification of students by the examination was
introduced by Krutchkoff (1). The separation level
$S(Z_1, Z_2)$ of two grades $Z_1$ and $Z_2$ with $Z_1 < Z_2$ is the
probability that a student with grade $Z_2$ knows a greater
proportion of subject matter than a student with grade $Z_1$.
Krutchkoff's derivation of $S(Z_1, Z_2)$ is based on the normal
approximation to the distribution of W, the number of answers
(out of N) known by the student, with the total normal
probability for W < 1 accumulated at W = 0 and the total
normal probability for W > N - 1 accumulated at W = N. As a
result of this approximation, the conditional probability
function of W given Z, P(W/Z), (Table 3 of Krutchkoff [1]),
showed higher probabilities at both tails of the
distribution. Another approach, mentioned by Krutchkoff (1)
but not employed, is to use a renormalized normal
distribution truncated at W = 0 and W = N. The approach to be
considered here seems more realistic than this latter
approach and leads to simple closed form solutions.
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In the present paper, the separation level is derived 
based on an exact distribution of W as a binomial random
variable with parameters N (the total number of questions
on the examination) and $p_1$ (the proportion of subject
matter known by the student). It turns out that P(W/Z) can be
obtained in a closed binomial form which can be directly
calculated. The present results can be applied to
Krutchkoff's data (Table 1 of reference 1) and accurate
values of P(W/Z) and $S(Z_1, Z_2)$, as given in Tables 1 and 2
of the present paper, can hence be recalculated.


Derivation of P(W/Z) and S(Z1,Z2)


Let us set: the number of answers known = W, the number of
correct guesses = Y, the total number of correct answers = Z;
so that

$$Z = W + Y. \tag{1}$$

Our derivation of the conditional probability function of
W given Z is based on the following two assumptions:


1. The probability distribution of W is binomial $(N, p_1)$,
   where $p_1$ is the proportion of subject matter known
   by a student.
2. The probability distribution of Y is binomial $(N - W, p)$,
   where $p = 1/n$, and n is the number of choices for each
   of the N questions.


It follows that

$$P(w) = \Pr(W = w) = \binom{N}{w} p_1^{w} q_1^{N-w}, \quad w = 0, 1, \ldots, N;\ q_1 = 1 - p_1, \tag{2}$$

$$P(z/w) = \Pr(Z = z \mid W = w) = \binom{N-w}{z-w} p^{z-w} q^{N-z}, \quad z = w, w+1, \ldots, N;\ q = 1 - p. \tag{3}$$

Hence, we have

$$P(z) = \Pr(Z = z) = \sum_{w=0}^{z} P(z/w)\, P(w). \tag{4}$$

It can be directly verified that

$$\binom{N}{w} \binom{N-w}{z-w} = \binom{N}{z} \binom{z}{w}, \tag{5}$$

and hence Equation (4) takes the form

$$P(z) = \binom{N}{z} (q q_1)^{N-z} \sum_{w=0}^{z} \binom{z}{w} p_1^{w} (p q_1)^{z-w} = \binom{N}{z} (p_1 + p q_1)^{z} (q q_1)^{N-z}, \quad z = 0, 1, \ldots, N. \tag{6}$$


By Bayes' inversion formula, Equations (2), (3), (5) and
(6), we thus obtain

$$P(w/z) = \frac{P(w)\, P(z/w)}{P(z)} = \binom{z}{w} \frac{p_1^{w} (p q_1)^{z-w}}{(p_1 + p q_1)^{z}} = \binom{z}{w} P^{w} Q^{z-w}, \quad w = 0, 1, \ldots, z, \tag{7}$$

where $Q = 1 - P$, and

$$P = \frac{p_1}{p_1 + p q_1} = \frac{n p_1}{1 + (n - 1) p_1}, \quad \text{since } p = \frac{1}{n}. \tag{8}$$

Equation (7) shows that the probability function of W given
Z is, in fact, binomial with parameters z and P as defined
in Equation (8).
Estimation of p1 and P

By Equation (3), we have

$$E(Y/w) = (N - w) p, \tag{9}$$

and hence by Equation (2),

$$E(Y) = (N - N p_1) p. \tag{10}$$

It follows that

$$E(Z) = E(W) + E(Y) = N p_1 + (N - N p_1) p, \tag{11}$$

so that

$$p_1 = \frac{E(Z) - N p}{N (1 - p)}. \tag{12}$$

Thus, estimators of $p_1$ and $P$ are

$$\hat{p}_1 = \frac{\bar{Z} - N p}{N (1 - p)}, \qquad \hat{P} = \frac{n \hat{p}_1}{1 + (n - 1) \hat{p}_1}. \tag{13}$$

Application 


Upon applying the above results to Krutchkoff’s (1) ex- 
ample, the following results are obtained: 


$N = 60$, $n = 4$, $\bar{Z} = 25.85$, $\hat{p}_1 = .2411$, $\hat{P} = .5596$.


Table 1.—The Conditional Probability Function P(W/Z) 


W      F(16)    D(21)    C(26)    B(32)    A(39)


All omitted entries are zero to at least four significant places- 
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Table 2.—Separation Levels S(Z;, Z5) 
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Z2 
Z1        D(21)    C(26)    B(32)    A(39)

F(16)     .7765    .9435    .9931    .9995
D(21)              .7501    .9413    .9937
C(26)                       .7747    .9546
B(32)                                .7928


Hence, we calculate by Equation (7) the binomial probabil- 
ity function P(w/z) for z = 16, 21, 26, 32 and 39. The re- 
sults are given in Table 1, to be compared with the 

results of Table 3 of Krutchkoff (1). 


Now, the separation level of $z_1$ and $z_2$ ($z_1 < z_2$) is

$$S(z_1, z_2) = \Pr(W_1 < W_2 \mid z_1 < z_2) = \sum_{i=1}^{z_2} P(W_2 = i \mid z_2) \sum_{j=0}^{\min(i-1,\, z_1)} P(W_1 = j \mid z_1).$$

Using Table 1, we calculate the values of $S(z_1, z_2)$ given in
Table 2.
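
The calculation described above lends itself to a short program. The following Python sketch is not part of the original paper; it applies Equations (7), (8), and (13) and the double sum for S(z1, z2) to the example values N = 60, n = 4, and a mean score of 25.85. The function names are illustrative assumptions, and W1 and W2 are treated as independent, as the product form of the sum implies.

```python
from math import comb


def estimate_p1(z_bar, n_questions, n_choices):
    """Equation (13): estimated proportion of subject matter known."""
    p = 1.0 / n_choices  # chance of guessing one question correctly
    return (z_bar - n_questions * p) / (n_questions * (1.0 - p))


def posterior_p(p1, n_choices):
    """Equation (8): P = n*p1 / (1 + (n - 1)*p1)."""
    return n_choices * p1 / (1.0 + (n_choices - 1) * p1)


def p_w_given_z(w, z, big_p):
    """Equation (7): binomial(z, P) probability that W = w given Z = z."""
    return comb(z, w) * big_p ** w * (1.0 - big_p) ** (z - w)


def separation_level(z1, z2, big_p):
    """Pr(W1 < W2) for W1 ~ binomial(z1, P) and W2 ~ binomial(z2, P)."""
    total = 0.0
    for i in range(1, z2 + 1):
        inner = sum(p_w_given_z(j, z1, big_p)
                    for j in range(0, min(i - 1, z1) + 1))
        total += p_w_given_z(i, z2, big_p) * inner
    return total


p1_hat = estimate_p1(25.85, 60, 4)
P_hat = posterior_p(p1_hat, 4)
grades = [("F", 16), ("D", 21), ("C", 26), ("B", 32), ("A", 39)]

print(f"p1_hat = {p1_hat:.4f}, P_hat = {P_hat:.4f}")
for k, (name1, z1) in enumerate(grades):
    for name2, z2 in grades[k + 1:]:
        s = separation_level(z1, z2, P_hat)
        print(f"S({name1}({z1}), {name2}({z2})) = {s:.4f}")
```

Because every quantity is a finite binomial sum, the same steps could be carried out on any device that supports exponentiation and binomial coefficients, which is the practical point the paper makes.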


Conclusions 

By using a binomial probability mass function for the 
known (without guessing) number of questions, a very sim- 
ple formula is obtained for P(W/Z). Although this does not 
significantly alter the previous conclusions concerning the 
exam, it does permit the calculations to be performed on a 
programmable desk calculator. 
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ABSTRACT 


For "sensitive-area" questions in interviewing, random response
techniques can be useful in establishing the types of questions
to be asked and the methods by which the respondent can
confidentially provide answers. This paper discusses three
randomized response techniques (the dichotomized question, the
unrelated question, and the single question/random answer), the
simple probability theory that makes the techniques work, and
some questions about the practical application of these
techniques by the educator.


RESEARCHERS ARE well aware of the pitfalls of non-sampling
error, particularly when personal interview techniques are
used. Surveys on human populations have established that an
interviewee's refusal to respond to "sensitive-area" questions
or his intentional giving of incorrect answers to the questions
is more likely the rule than the exception. Several reasons for
the giving of non- and/or incorrect answers by the interviewee
have been advanced: (1) modesty; (2) fear of being thought
bigoted; (3) reluctance to admit to unlawful or socially
deviant behavior; and (4) reluctance to confide intimacies to
strangers.

For whatever the reasons, many individuals fail to answer
interviewers' questions. This so-called "non-cooperative"
group (12:235-272) includes two types of individuals and
results in two types of non-sampling error: the non-answerer
who creates "refusal bias" (2:355-361; 12:261-269), i.e.,
failing to respond, and the incorrect-answerer who creates
"response bias" (7:280-325), i.e., purposely providing
incorrect answers. These types of bias error can be quite
serious. Cochran (2:235-245) has advised researchers to
scrutinize their methodology with the intent of avoiding
non-sampling error and has warned that the presence of such
error in the data is likely to result in misleading or even
incorrect conclusions.

Intuitively, the problems of refusal bias and response bias
become most serious as respondents are questioned about
matters they perceive as "sensitive" or whenever truthful
answers may place the respondent in an unfavorable light.
The experiences of field researchers show that while
controversial assertions seem to elicit resistance from the
interviewee, innocuous questions typically receive rather full
cooperation and truthful answers (13:63).
When interviewee resistance to a question is anticipated
(and, unfortunately, it often isn't simply because its
existence is so "personally" defined), the usual strategy is to
provide special training to sensitize the interviewer.
Typically, the interviewer is taught to anticipate resistance,
build a close rapport with the interviewee, create an open
atmosphere and a feeling of acceptance of ideas, and hear and
see (body language) what the interviewee is "really saying"
(11:574-587; 4:537-543). Such training of an interviewer
is not only prohibitive in cost in most cases, but also
questionable in terms of the extent to which it actually
reduces non-sampling error (8:161). There seems to be a natural
reluctance on the part of most interviewees to confide
feelings or facts to anyone, let alone a stranger,
particularly if the responses to the questions are recorded on
paper or audio-visual tape with (or without) name and address
identification.

Researchers today are also finding that questions which
demand answers "too revealing" can and will be challenged
both on ethical and practical grounds. Respect for a "right
to privacy" is demanded in public surveys as elsewhere
(14:884).

The traditional role of the interviewer has also changed. 
Today, the interviewer is keenly aware that he does not have 
the confidentiality privileges of the lawyer, doctor, or priest, 
but yet has the responsibility not to betray the trust which 
is rightfully expected of him (5:520-521). Also, the inter- 
viewer may contribute to non-sampling error simply by his 
reluctance to ask sensitive questions and thus may omit 
questions (contributing to refusal bias) or alter the
questions asked (contributing to response bias) (9).

To reduce non-sampling error and at the same time protect
the anonymity of the interviewee and the confidentiality
of the interviewer, Warner (13) has devised an interviewing
method called the randomized response technique which is
based on answering probabilistically selected questions.


The Randomized Response Technique 


In its original form, the randomized response technique has
the respondent in the interview situation answer one of two
questions in a designed set without revealing to the
interviewer which question has been answered. The pair of
questions is so structured that each of the questions could
receive the same classification of answers (e.g., both
questions can be answered true/false, or yes/no); and, thus,
the identity of the question is not revealed by the nature of
the answer. Also, the two questions are worded such that the
response is not necessarily incriminating. For example, the
two questions might be of the form: (1) Are you a member of
Group A?, and (2) Are you a member of Group not A? This type
of question might be designed to gain information from a
group of student respondents as to their personal experience
with homosexuality in a college dorm. The two questions might
then be: (1) "Have you had at least one homosexual experience
in a college dorm during this past year?" and (2) "Have you
had no homosexual experiences in a college dorm during the
past year?"


Data Collection 


The respondent would be allowed to confidentially and
randomly select one of the two questions to answer. For
confidentiality, only the answer he gives is recorded; no
indication is made as to which question was answered. Note
that a "true" or "false" answer is appropriate to either
question and thus the interviewee, looking at the pair of
questions, believes he has retained his privacy even though he
responds accurately to either of them.

The randomizing process on choice of question is a
controlled one, however. A typical procedure would involve
giving a spinner (or other randomizing device) to the
respondent with the direction that he confidentially spin it
and answer the question number to which it points. The spinner
might be marked such that 70% of the time it would land on or
indicate that Question 1 should be answered and 30% of the
time it would land on or indicate that Question 2 should be
answered.

Given the proportion of time that the respondents will
be directed by the spinner to answer Question 1 or Question
2 (and assuming the respondents cooperate and answer
truthfully), simple probability can be used to estimate the
actual number or percent of individuals belonging to Group A
or, in the example above, those who have had at least one
homosexual experience in a college dorm during the past
year. Statistically,


P(true) = P(randomly selecting Question 1 and
            answering "true") + P(randomly
            selecting Question 2 and answering
            "true")


or


P(true) = P(Q1 selected) x P(true/Q1 selected)
          + P(Q2 selected) x P(true/Q2 selected)


If we let $P_1$ = the probability that Q1 is selected,
          $\pi_1$ = the probability that Q1 is answered "true,"
then $P(\text{true}) = P_1 \pi_1 + (1 - P_1)(1 - \pi_1)$. Because the
$P_1$ value of the randomization device is pre-set before the
interview starts, $\pi_1$ becomes the only parameter to be
estimated.

Suppose that for a randomly selected group of 400 students
(sophomores, juniors, and seniors) from a predominantly
"live-in" church-related school, the randomized response
technique as described above resulted in a total of 176 of
the 400 students (or 44%) responding "true" (to Questions 1
and 2). Assume that the randomizing mechanism was pre-set
such that there was a 70% chance that a student would select
(by having the pointer land on) Question 1 and a 30% chance
that the student would select (by having the pointer land on)
Question 2. Using the 44% total "true" response (or .44) in
this sample as an estimate of the proportion of the population
that would answer "true," or $P(\text{true}) = \hat{P}(\text{true}) = .44$,
and the 70% chance that a student would select Question 1 as
the probability of selecting that question, or $P_1 = .7$, then

$$\begin{aligned}
P(\text{true}) &= P_1 \hat{\pi}_1 + (1 - P_1)(1 - \hat{\pi}_1) \\
.44 &= .7\hat{\pi}_1 + .3(1 - \hat{\pi}_1) \\
.44 &= .7\hat{\pi}_1 + .3 - .3\hat{\pi}_1 \\
.44 - .3 &= .4\hat{\pi}_1 \\
\hat{\pi}_1 &= .35
\end{aligned}$$

Therefore, the estimate of the true proportion of students
having at least one homosexual experience in a college dorm
during the last year ($\hat{\pi}_1$) is 35%.
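
The arithmetic of this example can be checked with a few lines of Python. The sketch below is an illustration added here, not code from the article; the function names are hypothetical. The standard error is an assumption of this sketch: it treats the observed proportion of "true" answers as a binomial proportion from a simple random sample of truthful respondents and passes it through the linear estimator.

```python
def warner_estimate(n_true, n_total, p_q1):
    """Solve P(true) = P1*pi + (1 - P1)*(1 - pi) for pi."""
    if p_q1 == 0.5:
        raise ValueError("P1 = .5 gives no information about pi")
    lam = n_true / n_total  # observed proportion of "true" answers
    return (lam - (1.0 - p_q1)) / (2.0 * p_q1 - 1.0)


def warner_std_error(n_true, n_total, p_q1):
    """Approximate standard error of the estimate (binomial sampling assumed)."""
    lam = n_true / n_total
    return (lam * (1.0 - lam) / n_total) ** 0.5 / abs(2.0 * p_q1 - 1.0)


pi_hat = warner_estimate(176, 400, 0.7)
se = warner_std_error(176, 400, 0.7)
print(f"pi_hat = {pi_hat:.2f}")
print(f"approximate standard error = {se:.3f}")
# Large-sample normal interval; 1.96 is the usual 95% multiplier.
print(f"approximate 95% interval = ({pi_hat - 1.96 * se:.3f}, {pi_hat + 1.96 * se:.3f})")
```

Dividing by (2P1 - 1) is what makes the estimate less precise than a direct question: the closer P1 is to .5, the smaller that divisor becomes.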
Under the assumption that all "true" and "false" responses
are truthful, Warner (13:64-65) has shown that the expected
value, or $\hat{\pi}_1$, of the proportion of "true" answers in
the sample is the maximum likelihood estimate of the true
population proportion $\pi_1$. Also, because the variance of
the estimate is easily calculated, the construction of
confidence intervals and/or hypothesis testing is quite
straightforward using either the binomial distribution or,
with the usual large sample sizes, the asymptotic normal
approximation.


Selection of P-values


The rationale for using a randomized response technique is
based on the assumption that the procedure will elicit better
cooperation (fewer refusals and/or fewer intentionally
incorrect answers) from interviewees. If $P_1$ were set equal
to 0 or 1, the whole randomizing process would degenerate into
the traditional procedure of simply asking the sensitive
question. At the other extreme, if $P_1$ were set equal to .5,
the interviewee would, in fact, be furnishing no information.
For P-values set between 0 and .5 or between .5 and 1
(exclusive), the interviewee is providing only probabilistic
information as to his group membership. As the P-values
approach .5 from either direction, less and less information
is gained from the respondent as larger estimate variances
result.

The question of the sample size required given a desired
level of precision depends on the P-value selected.
Presumably, the closer P is set to .5, the more likely
respondents are to cooperate in answering the "pair" of
questions. That is, the less information requested, the more
likely the respondent is to cooperate. However, as the P-value
approaches .5 from either direction, the variance of the
estimate of the population parameter $\pi$ (VAR $\hat{\pi}$)
approaches a maximum. Thus, the real issue involves selecting
a P-value close to 0 or 1 so as to minimize the necessary
sample size, yet far enough away from 0 or 1 to assure
cooperation from the interviewee.

It is obvious that if all interviewees told the truth, the
randomized response technique would require a larger sample
than that required for the traditional approach, for any
desired degree of precision in the estimate. However, the
more important comparisons are between the randomized
response technique and traditional interview techniques under
the realistic assumption that the traditional estimates are
biased due to less than 100% truthful reporting. Warner (13)
has presented evidence to this point and has shown that with
even minimal untruthful reporting, the randomized response
technique can out-perform the traditional interview
technique.


The Unrelated Question Randomized Response Technique


The randomized response model, as outlined in the previous
section of this paper, involves an interviewee answering one
of two questions in a pair, the questions being related to the
extent that they are direct opposites of one another. Field
researchers (10) have suggested that this dichotomization of
the questions in the pair may actually be confusing to many
interviewees in the sense that the second question may involve
a double negative and thus may be perceived to be of the form,
"Heads, you win; tails, I lose."

The actual result of dichotomizing the questions may be
quite different, therefore, than that intended. While the
researcher sees the dichotomization of questions to mean that
neither a "true" nor a "false" response should be
incriminating, the interviewee may feel that both are. The
supposed psychological advantage of dichotomizing the
questions may, then, actually be perceived as a disadvantage.

To overcome such a limitation of Warner's randomized
response technique, the unrelated question randomized
response technique was developed by Simmons (5). The
strategy of this technique is to have the respondent answer
one of two randomly chosen questions, as he would in the
previous technique, only this time the questions are
unrelated instead of dichotomized. For example, the two
questions might be of the form: (1) Are you a member of
Group A?, and (2) Are you a member of Group B?, where
membership in Groups A and B are unrelated and membership
in only one of the groups is stigmatizing. A specific
question might be designed to gain information from a
group of student respondents as to their personal experience
with drug experimentation. The two questions might then
be: (1) "Do you smoke marijuana at least once a week?"
and (2) "Do you own a car?"


Data Collection 


The respondent would be allowed to confidentially and
randomly select one of the two questions to answer, as in the
previously discussed technique. Again, note that a "yes" or
"no" answer is appropriate to either question and the
interviewee presumably believes that he has retained his
privacy even though he responds accurately to the question he
chooses to answer.

Again, also, a controlled randomizing device is used. The
selection mechanism will be set to indicate Question 1 a
certain proportion of the time and Question 2 the
complementary proportion.

The supposed advantage of this unrelated question approach
is that the interviewee can see more clearly that the two
questions from which he chooses are entirely different.
However, a disadvantage with this approach is that only "yes"
responses are implicating from the interviewee's point of
view.

As initially developed, the unrelated question randomized
response technique involved two unknown parameters: the
proportion of the population who were actually Group A
members, and the proportion of the population who were
actually Group B members. In the example here of the
questions on marijuana smoking and car ownership, the two
unknown parameters are: (1) the proportion of the population
who smoke marijuana at least once a week, and (2) the
proportion of the population who actually own a car. The
probability formula for this model would thus be:


P(yes) = P(randomly selecting Question 1 and
           answering "yes") + P(randomly selecting
           Question 2 and answering "yes")


or


P(yes) = P(Q1 selected) x P(yes | Q1 selected) +
         P(Q2 selected) x P(yes | Q2 selected)


If we let $P_1$ = the probability that Q1 is selected, and
          $\pi_1$ = the probability that Q1 is answered "yes,"
          and
          $\theta_1$ = the probability that Q2 is answered "yes,"
then $P(\text{yes}) = P_1 \pi_1 + (1 - P_1)\theta_1$. Although the value
for $P_1$ is known, values for both $\pi_1$ and $\theta_1$ are unknown.

To solve for two unknowns, it is mathematically necessary
to have at least two equations. Thus, it appears that
the unrelated question randomized response technique
would require that two samples be taken. The resulting set
of simultaneous equations would be of the form

$P(\text{yes}) = P_1 \pi_1 + (1 - P_1)\theta_1$ for sample No. 1, and

$P(\text{yes}) = P_1' \pi_1 + (1 - P_1')\theta_1$ for sample No. 2,

where the selection probability $P_1'$ set for the second sample
must differ from $P_1$ for the two equations to have a unique
solution.
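
As a rough illustration of the two-sample idea, the Python sketch below solves the pair of equations for the two unknown proportions. It is not taken from the article: the function name is hypothetical, and it assumes that the two samples use different, known probabilities of selecting Question 1, which is what makes the two equations solvable. The input values are made up for the check.

```python
def solve_two_samples(lam1, lam2, p_a, p_b):
    """Solve lam = P*pi + (1 - P)*theta for (pi, theta) from two samples.

    lam1, lam2: observed proportions of "yes" in samples 1 and 2.
    p_a, p_b:   probabilities of selecting Question 1 in samples 1 and 2.
    """
    if p_a == p_b:
        raise ValueError("the two samples need different selection probabilities")
    det = p_a - p_b
    pi = (lam1 * (1.0 - p_b) - lam2 * (1.0 - p_a)) / det
    theta = (lam2 * p_a - lam1 * p_b) / det
    return pi, theta


# Made-up check: with pi = .2 and theta = .6, selection probabilities of
# .7 and .3 give expected "yes" proportions of .32 and .48.
pi_hat, theta_hat = solve_two_samples(0.32, 0.48, 0.7, 0.3)
print(f"pi_hat = {pi_hat:.3f}, theta_hat = {theta_hat:.3f}")
```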

The requirement of two samples is a serious drawback to
using the unrelated question randomized response technique.
However, it may be possible to select the unrelated question
so that the probability of a "yes" response, or $\theta_1$, is
known beforehand and thus need not be estimated. For example,
the question set might be: (1) "Do you regularly 'cheat'
when taking classroom exams?" and (2) "Were you born in
the month of January?"

Presumably, information regarding the proportion of
births in the month of January would be easily attainable
from census data. Thus, when the proportion of respondents
answering "yes" to the unrelated or non-sensitive question
can be determined a priori or exogenously, only one sample
will be needed to estimate the one unknown parameter.

However, the parameter value secured from the census
data, etc., is appropriate only so long as it is valid for the
population under study (i.e., the target population). As an
example, the birth months of elementary and high school
students may reflect recent trends toward "planned
parenthood" and/or "spaced childbirth" and thus may differ
from patterns found in the United States population as a
whole.
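
When the "yes" probability for the unrelated question is known, the single unknown follows from one sample by rearranging P(yes) = P1*pi + (1 - P1)*theta. The Python sketch below illustrates this; it is an addition, not the article's own computation. The survey counts are invented, and the value 31/365 for January births is an assumption of uniform birth dates standing in for the census figure the text mentions.

```python
def unrelated_question_estimate(n_yes, n_total, p_q1, theta):
    """Estimate pi from P(yes) = P1*pi + (1 - P1)*theta with theta known."""
    if p_q1 <= 0.0:
        raise ValueError("the sensitive question must be selected sometimes")
    lam = n_yes / n_total  # observed proportion of "yes" answers
    return (lam - (1.0 - p_q1) * theta) / p_q1


theta_january = 31 / 365  # assumed: births spread evenly over the year
# Invented data: 80 "yes" answers from 500 respondents, with the spinner
# pointing to the sensitive question 80% of the time.
pi_hat = unrelated_question_estimate(80, 500, 0.8, theta_january)
print(f"estimated proportion for the sensitive question = {pi_hat:.4f}")
```

If the assumed value of theta is wrong for the target population, the error carries straight into the estimate, which is the caution raised in the preceding paragraph.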


Selection of P-values 

The purpose of Simmons' unrelated question randomized
response technique was to secure better interviewee
cooperation. That is, Simmons believed that randomizing
responses was good but that Warner's dichotomized question
approach was often confusing to the interviewee. Also,
to insure truthful reporting with Warner's model, P-values
close to .5 (low information content per interviewee) would
presumably have to be used, whereas with clearly unrelated
questions, P-values close to 0 or 1 (high information content
per interviewee) can be used.

The statistical implication of Simmons' questioning
strategy (which allows for P-values being close to 0 or 1) is
that relatively small sample sizes can result in relatively
good predictions, i.e., small variance estimates. Greenberg et
al. (5) have shown, in fact, that under rather general
conditions, Simmons' unrelated question technique will
out-perform Warner's model. This is generally true when only
one sample is needed (only one parameter is unknown), but may
also be true under certain conditions when two samples are
needed (two parameters are unknown).
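
A numerical comparison can make the role of the P-value concrete. The Python sketch below is an illustration under stated assumptions, not a reproduction of the cited results: it uses the binomial variance of the observed proportion, truthful answers, a known "yes" probability for the unrelated question, and made-up values for the true proportion and the sample size.

```python
def warner_variance(pi, p, n):
    """Variance of Warner's estimate (dichotomized questions)."""
    lam = p * pi + (1.0 - p) * (1.0 - pi)
    return lam * (1.0 - lam) / (n * (2.0 * p - 1.0) ** 2)


def unrelated_variance(pi, theta, p, n):
    """Variance of the unrelated question estimate when theta is known."""
    lam = p * pi + (1.0 - p) * theta
    return lam * (1.0 - lam) / (n * p ** 2)


PI = 0.2            # assumed true proportion in the sensitive group
THETA = 31 / 365    # assumed known "yes" rate for the neutral question
N = 400             # assumed sample size

for p in (0.6, 0.7, 0.8, 0.9):
    vw = warner_variance(PI, p, N)
    vu = unrelated_variance(PI, THETA, p, N)
    print(f"P = {p:.1f}: Warner variance = {vw:.5f}, unrelated = {vu:.5f}")
```

Under these assumptions the Warner variance grows quickly as P moves toward .5, because the divisor (2P - 1) shrinks toward zero, while the unrelated question divisor P does not.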


The Single Question/Random Answer Randomized 
Response Technique 


A third approach to the randomized response technique
has been suggested by Greenberg et al. (5). Unlike either
Warner's dichotomized questions or Simmons' unrelated
questions, Greenberg's method involves only the single
sensitive question. The randomness of response is gained by
the opportunity for the interviewee to select from three
possible answers. The question in this technique would be of
the same general type as in the other two techniques, i.e.,
of the form: Are you a member of Group A?, where membership
in Group A is stigmatizing. A specific question might be
designed to gain information from a group of student
respondents as to their personal feelings about minority
student treatment on the school campus. The question
might then be: "Do you feel that favoritism in terms of
'easy grades' is given to minority students?"


Data Collection 

The respondent would select an answer to the question 
based on the concealed outcome of some randomizing de- 
vice. For example, a spinner showing three area designa- 
tions—red, white, and blue—might be used. The interviewee 
would be instructed to answer honestly the sensitive ques- 
tion if the spinner showed red; answer “no” if the spinner 
showed white; and answer “yes” if the spinner showed 
blue. The probability model for this technique becomes, 


P(yes) = P(randomly selecting red, the sensitive
           question, and answering "yes") +
           P(randomly selecting blue)


or 


P(yes) = P(red selected) x P(yes | red selected) + 
P(blue selected) 


If we let $P_1$ = the probability that red is selected, and
          $P_2$ = the probability that blue is selected, and
          $\pi_1$ = the probability that the sensitive question
          (red) is answered "yes,"
then $P(\text{yes}) = P_1 \pi_1 + P_2$. Note that only "yes" answers to
the question are incriminating, and that this would be a
similar drawback for the interviewee as would his "yes"
response in the unrelated question technique. However, this
single question model has the advantage for the researcher
of always requiring only one sample since only one parameter
needs to be estimated (7).
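
The model above can be tried out with a small simulation. The Python sketch below is an added illustration, not part of the article: the function names, the spinner probabilities, and the true proportion are all assumptions chosen for the demonstration. It simulates the red/white/blue spinner with truthful respondents and then recovers the proportion by rearranging P(yes) = P1*pi + P2.

```python
import random


def single_question_estimate(n_yes, n_total, p_red, p_blue):
    """Estimate pi from P(yes) = P_red*pi + P_blue."""
    lam = n_yes / n_total  # observed proportion of "yes" answers
    return (lam - p_blue) / p_red


def simulate_yes_count(true_pi, p_red, p_blue, n, seed=0):
    """Count "yes" answers from n simulated truthful respondents."""
    rng = random.Random(seed)
    n_yes = 0
    for _ in range(n):
        spin = rng.random()
        if spin < p_red:
            # Red: answer the sensitive question honestly.
            if rng.random() < true_pi:
                n_yes += 1
        elif spin < p_red + p_blue:
            # Blue: forced "yes".
            n_yes += 1
        # Otherwise white: forced "no", nothing to count.
    return n_yes


TRUE_PI, P_RED, P_BLUE, N = 0.3, 0.6, 0.2, 200_000
n_yes = simulate_yes_count(TRUE_PI, P_RED, P_BLUE, N)
pi_hat = single_question_estimate(n_yes, N, P_RED, P_BLUE)
print(f"true pi = {TRUE_PI}, estimated pi = {pi_hat:.3f}")
```

Only the total count of "yes" answers is used, so nothing in the recorded data shows which color any one respondent drew.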
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Selection of P-values 


The single question/random answer randomized re-
sponse technique was originally designed for the purpose of
overcoming one of the shortcomings of Simmons' unrelated
question method, i.e., the necessity to estimate (or know)
the population proportion associated with the neutral ques-
tion. And, to this end, the single question technique is ef-
fective.

However, interviewees may find the format of the single
question technique confusing. Similar to the concern asso-
ciated with Warner's dichotomized question technique, in-
terviewees may believe the question to be a trick. The fact
that the method, in two out of its three outcomes,
actually tells a respondent what type of answer to give may
make it suspect. Thus, it appears that the initial issue in
using the single question technique is convincing the inter-
viewee of the method’s honesty. Once this is accomplished, sam-
ple size requirements can be addressed.

There appear to be two generalizations concerning the
selection of P-values for Greenberg’s single question/ran-
dom answer model: the P-values associated with “red,”
“white,” and “blue” must all be greater than 0; and be-
cause only “yes” responses can be stigmatizing, the proba-
bility associated with “blue” must be relatively high.

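The tension between respondent protection and sample size can be made concrete. For the single question model, the estimator of the sensitive proportion is (observed share of “yes” minus P(blue selected)) divided by P(red selected), so its variance is P(yes)(1 - P(yes)) / (n x P(red selected)^2). The sketch below assumes this standard binomial result; the function name and the numeric settings are hypothetical and are not recommendations from the article.

```python
import math


def required_sample_size(pi, p_red, p_blue, target_se):
    """Smallest n giving the requested standard error for the estimate.

    Uses Var(pi_hat) = lam * (1 - lam) / (n * p_red**2), where
    lam = p_red * pi + p_blue is the overall probability of a "yes".
    """
    if not (0 < p_red <= 1) or p_blue < 0 or p_red + p_blue > 1:
        raise ValueError("invalid spinner probabilities")
    lam = p_red * pi + p_blue
    return math.ceil(lam * (1 - lam) / (target_se * p_red) ** 2)


if __name__ == "__main__":
    # Hypothetical settings: a smaller red area protects the respondent
    # more but demands a larger sample for the same precision.
    for p_red, p_blue in [(0.9, 0.05), (0.7, 0.15), (0.5, 0.25)]:
        print(p_red, p_blue, required_sample_size(0.2, p_red, p_blue, 0.02))
```

Shrinking the red area raises the required sample size roughly with the inverse square of P(red selected), which is one way to see why the choice of P-values cannot be separated from sample size constraints.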

Summary, Conclusion, and Implications 


Two types of non-sampling error in interviewee research
are “refusal bias” and “response bias.” These errors often
result from respondents giving no answer or a purposely in-
correct answer to a “sensitive area” question.

A randomized response technique is useful to the inter-
viewer and interviewee in assuring both persons that the
answer made to the incriminating question is confidential
and, thus, is more likely to be truthful. The dichotomized
question variation by Warner allows the interviewee to
respond to his “random” selection of one of a pair of
directly opposite questions. The unrelated question variation
by Simmons allows the interviewee to respond to his “ran-
dom” selection of one of two unrelated questions. The
single question/random answer variation by Greenberg
et al. allows the interviewee only the one incriminating
question, but a choice of three responses. In each of these
cases, the selection of the question to answer (or the re-
sponse to make) is accomplished with the help of a random-
izing device (e.g., a spinner, a box of marbles, a die).

The various randomized response techniques for obtain-
ing answers to sensitive questions are intuitively appealing.
For the most part, the theoretical-statistical issues have
been resolved. The following questions of application and
methodology, however, persist:

There are no rules or guidelines for identifying which 
technique variation should be used under what circum- 
stances. 


The critical issue of selecting optimal P-values has not
been resolved. At best, a researcher can recognize a neces-
sary tradeoff between obtaining interviewee cooperation
(insuring anonymity) and working within situational con-
straints of maximum sample size.


The “best” type of randomizing mechanism seems to be
open to debate. To date, researchers have tried spinners,
plastic boxes filled with colored balls, decks of cards, and
random number tables with varying results.

The correlation between the understandability of a
method and the cooperation it achieves has not been studied.
In other words, to what extent does the interviewee (and
interviewer) have to understand the underlying randomizing
features of the model to insure his (their) cooperation? Does
the degree of sophistication associated with probability
theory preclude the use of randomizing techniques for cer-
tain audiences? Would it be necessary to carefully explain
the purpose and mechanism of randomization to each inter-
viewee?

Perhaps further research will lead to definitive answers
to these methodological or application questions.


FOOTNOTES 


1. Note that when $P_1 = .5$, the equation is unsolvable.

2. Warner (13) has shown that
$\mathrm{VAR}(\hat{\pi}_s) = \frac{1}{n}\left[\frac{1}{16(P_1 - \frac{1}{2})^2} - (\pi_s - \frac{1}{2})^2\right]$.

3. It is possible for $\hat{\pi}_s$ to take on values outside the 0 to 1 range.
For example, suppose $P_1 = .7$ and the sample produces P(true) = .25.
The solution equation becomes

$.25 = .7\hat{\pi}_s + .3 - .3\hat{\pi}_s$,
$-.05 = .4\hat{\pi}_s$,
$\hat{\pi}_s = -.125$.

Thus, $\hat{\pi}_s$ takes on a negative value, mathematically correct but

meaningless in terms of a solution to the problem at hand. This re-
sult (and similar unusual ones) can happen (still assuming truthful
responses) whenever sample results vary to some (generally large)
degree from expectation, i.e., when the actual P(true) is, by chance
alone, significantly different from the expected P(true). (This
occurrence is relatively unlikely, though, for even moderately large
sample sizes.)

4. The subscripts 1 and 2 on the P-values denote the randomiz-
ing probabilities for Samples No. 1 and 2, respectively. Note that $P_1$
must not be set the same as $P_2$, i.e., $P_1 \neq P_2$. This is a necessary
mathematical condition for solving simultaneous equations.
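Footnotes 1 through 3 can be checked numerically. The sketch below assumes Warner's dichotomized model in the form P(true) = P x pi + (1 - P) x (1 - pi), which is the form the worked figures in footnote 3 imply; the function names are illustrative and not from the article.

```python
def warner_estimate(true_share, p):
    """Solve true_share = p*pi + (1 - p)*(1 - pi) for pi.

    p is the probability that the randomizing device selects the
    sensitive statement; the equation has no solution at p = 0.5.
    """
    if abs(p - 0.5) < 1e-12:
        raise ValueError("p = 0.5 makes the equation unsolvable")
    return (true_share - (1 - p)) / (2 * p - 1)


def warner_variance(pi, p, n):
    """Variance of the estimate in the form quoted in footnote 2."""
    if abs(p - 0.5) < 1e-12:
        raise ValueError("variance is undefined at p = 0.5")
    return (1 / (16 * (p - 0.5) ** 2) - (pi - 0.5) ** 2) / n


if __name__ == "__main__":
    # Footnote 3: P = .7 and an observed P(true) of .25 give a negative
    # estimate, which is arithmetically valid but not a usable proportion.
    print(warner_estimate(0.25, 0.7))
    print(warner_variance(0.2, 0.7, 100))
```

The variance grows without bound as P approaches .5, which is the same condition under which footnote 1 reports the equation to be unsolvable.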
COMPREHENSION BY COLLEGE STUDENTS
OF TIME-COMPRESSED LECTURES


LORETTA ADELSON


ABSTRACT


Time-compressed speech has been suggested as an educational medium for sighted and blind students,
with researchers reporting high comprehension of short, unrelated passages. For this study a repeated
measurement design was used in presenting equated one-hour lectures to 200 students at 175 and 275
words per minute. Mean comprehension scores, standard deviations, t-tests and percentiles were used
to analyze the data. Standard deviations were stable, and time-compressed comprehension scores showed
losses that were significant at the .01 level. The Fairbanks efficiency index is thus questioned, and it
is further suggested that time-compressed speech as an educational medium needs further study.


TIME-COMPRESSED SPEECH refers to recorded
speech which has been altered in time. In 1952, Fairbanks,
Everitt, and Jaeger produced an electronic device which
was capable of picking up millimeters of sound at a pre-
determined rate, discarding those sounds, and then abut-
ting the remaining sounds. The result was speech recorded
at any desired rate without the loss of any one complete
phoneme, continuity, or the original pitch and vocal qual-
ity (2).

In order to test comprehension of time-compressed
speech, Fairbanks, Guttman, and Miron composed two fac-
tual passages, each over 1500 words in length. They re-
ported that at 282 wpm, “the response was slightly less
than 90% of the response at 141 wpm” (5:18). Encouraged
by such results, time-compressed speech was then given
serious consideration as a useful medium in the education
of the blind as well as the sighted, and research in its use
was aided by the U. S. Office of Education (9, 10).
On the whole, materials that have been used have been
comparatively short in length. One researcher used passages
that ran for 18 to 20 minutes at the normal rate, 175 wpm
(10). More frequently, however, researchers have relied
upon standardized listening comprehension tests (11, 12, 14, 
15). The use of such tests may be questioned since the tests 
are comprised of unrelated passages varying in length from 
25 seconds to 3 minutes and 45 seconds. Each passage is 
followed immediately by a written test of comprehension, 
In a critical evaluation of the STEP Listening Test Lorge
wrote, “In terms of the objectives stated by the teachers
and educators, this makes for relatively short listening com-
prehension situations. Among high school and college stu-
dents the expectation should be for significantly longer
listening time and for more questions” (13:572). The cus-
tomary college lecture hour would appear to be a more
realistic condition to use in the assessment of the listening
comprehension of time-compressed speech.

The purpose of this study was to examine the degree of
comprehension that was shown by a group of college stu-
dents when listening to a one-hour lecture of educational
materials at the normal rate of speech, 175 wpm, as com-
pared with the degree of comprehension that was shown
by the same group of college students when listening to an
equated one-hour lecture at a time-compressed rate of 275
wpm for 40 minutes.¹


Method 
Subjects 


Two hundred undergraduate students at Brooklyn Col- 
lege of the City University of New York volunteered for 
this study. All spoke English as their first language. Only 
those students who wrote the correct answers to all 21 
questions of the Harvard Psycho-Acoustic Laboratory Audi- 
tory Test No. 12 participated (3:490). 


Materials 


Six passages from English history, equated by Friedman
and Orr, were used to comprise two one-hour lectures.²
The passages had been equated for length, the average num-
ber of words per sentence, the average number of syllables
per 100 words, listening grade as determined by Rogers’
listenability formula, reading ease measured by the Flesch
formula, the number of independent clauses, and the aver-
age number of words not on the Dale list. Three passages
were recorded on one tape and constituted Lecture A.
Three other passages were recorded on another tape and
constituted Lecture B. Both lectures were prepared at the
normal and time-compressed rates in order to satisfy the de-
sign as shown in Figure 1. In order to familiarize the stu-
dents with time-compressed speech, a two-minute selection
from a seventh passage was taped so as to immediately pre-
cede the time-compressed lecture. Each lecture ran for 60
minutes at the normal rate, and for 40 minutes at the time-
compressed rate.


Listening comprehension tests for the six passages had
been devised by Friedman and Orr. The test for each lecture
had, in effect, 75 five-option multiple choice items. Scores
were corrected for chance and given to the nearest whole
number. Comprehension tests were administered to 34 stu-
dents without presentation of the lectures. The mean com-
prehension scores for Lectures A and B were 6.44 and 3.71,
respectively, and the average of these two mean scores is
5.08. Raw scores only were used in the study.

Listening materials were recorded on magnetic tape sup-
plied by the American Institutes for Research. They were
presented free-field by the use of a Wollensak Model 1520 at
3¾ speed in a semi-soundproofed room (16, 18). Intelligi-
bility of the lectures was considered to be good.


Design 


This was a repeated measurement study in which sub-
jects were compared with their own performances. The de-
sign used controlled for any differences in Lecture A and
Lecture B difficulty levels (Figure 1).


Group          SESSION 1                  SESSION 2

I              Lecture A - Normal         Lecture B - Time-Comp.
(n = 50)       Followed by Test A         Followed by Test B

II             Lecture B - Time-Comp.     Lecture A - Normal
(n = 50)       Followed by Test B         Followed by Test A

III            Lecture B - Normal         Lecture A - Time-Comp.
(n = 50)       Followed by Test B         Followed by Test A

IV             Lecture A - Time-Comp.     Lecture B - Normal
(n = 50)       Followed by Test A         Followed by Test B


Figure 1. Experimental design used to control Lecture A and
Lecture B difficulty levels

Procedure 


After having performed a pilot study with 20 students,
200 students were assigned consecutively and at random to
Groups I, II, III, and IV. To encourage a high level of per-
formance, students were told that those who achieved a
score in the upper 10 percent would receive a letter of com-
mendation.
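The counterbalanced design in Figure 1 and the random assignment described above can be sketched as follows. This is an illustration only: the schedule mirrors the four group rows of Figure 1, while the shuffling procedure, the seed, and the names `SCHEDULE` and `assign_groups` are assumptions, since the article does not say how the random assignment was carried out.

```python
import random

# Session 1 and Session 2 conditions for each group, following Figure 1.
SCHEDULE = {
    "I": [("A", "normal"), ("B", "time-compressed")],
    "II": [("B", "time-compressed"), ("A", "normal")],
    "III": [("B", "normal"), ("A", "time-compressed")],
    "IV": [("A", "time-compressed"), ("B", "normal")],
}


def assign_groups(student_ids, seed=0):
    """Shuffle the students, then deal them out to the four groups in turn."""
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    groups = list(SCHEDULE)
    return {sid: groups[i % len(groups)] for i, sid in enumerate(ids)}


if __name__ == "__main__":
    assignment = assign_groups(range(200))
    for group in SCHEDULE:
        size = sum(1 for g in assignment.values() if g == group)
        print(group, size, SCHEDULE[group])
```

Each lecture and rate combination appears once in the first session and once in the second, so order and lecture form are balanced against rate.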
Table 1. Mean Comprehension Scores and Standard Deviations for Groups I, II, III, IV, and the Total Sample

[Table values not recoverable. Columns: Passage 1, Passage 2, Passage 3, and Total Lecture, each at the Normal and Time-Comp. rates.]

Raw scores were used for all computations. A comparison
of comprehension was effected by obtaining the mean scores
and standard deviations under both conditions for the total
sample and its four component groups, and carrying out t-
tests in order to determine whether there were any signifi-
cant effects between: the normal and time-compressed rate;
the first and second order of presentation of the lectures; the
form of Lecture A and the form of Lecture B. To determine
the significance of the differences between the normal rate
and time-compressed rate mean comprehension scores for
the total sample and all subgroups, t-tests were also used.
Percentiles were computed manually for the normal rate
scores. These were used as a reference normative base group
for the time-compressed scores in order to observe the de-
gree of separation between the two conditions.


Discussion 


Mean Comprehension Scores and Standard Deviations


The mean scores and standard deviations for the total
sample and the four component groups are given by pas-
sage and total lecture at both rates in Table 1. Of prime in-
terest, the 200 subjects had a mean score of 24.89 at the
normal rate, and a mean score of 16.66 at the time-
compressed rate, showing a decrement of 8.23 points, or
approximately one-third of the normal rate score.

The standard deviations may be considered to be uni-
form from passage to passage and from condition to condi-
tion. Differences of the standard deviations given for
Groups I, II, III, and IV are interpreted as reflecting the
differences between the groups due to random sampling
rather than having used matched groups. The larger differ-
ences between the standard deviations of the total scores
and the passages reflect the increased length of the mate-
rials and the resulting wider range of scores.


t-tests 


In order to determine if there were significant differ- 
ences associated with the order of presentation (Session 1 
versus Session 2), the form of the lecture (Lecture A versus 
Lecture B), and the rate (Normal versus Time-Compressed), 
t-tests were computed for the pairs of conditions. Tables 2 
and 3 indicate that at the .05 level, for a two-tail test, there 
was no significant effect of order or form on the mean com- 
prehension scores for all four component groups. 

However, the effect of rate was statistically significant 
at the .01 level for three of the four conditions shown in 
Table 4. All four differences were in the same positive direc- 
tion. The differences in size of the t-values for Lecture A 
under both conditions, as compared with the t-values for 
Lecture B under both conditions, suggest that rate pro- 
duced a stronger effect on Lecture A than it did on Lecture 
B. This possibility is supported by the fact that as originally 
devised by Friedman and Orr, Lecture A contained ten ques- 
tions more than Lecture B. This was adjusted in scoring. 
However, it is plausible that the use of a larger number of 
questions was due to a larger number of ideas per listening 
time. It is well to note that greater density of ideas in Lec- 
ture A might have interfered with apprehension of the mate- 
rials at the time-compressed rate, but had no effect at the 
normal rate when there was adequate time for the process-
ing of the concepts presented.

Also confirmed by t-tests was that the differences of the 
two comprehension scores were significant at the .01 level 
for the total sample and three of the four component 
groups (Table 5). The other component was significant at 
the .05 level. Differences are large and in the same positive
direction.
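For readers who want to see the kind of computation involved, the sketch below shows a t-test on paired differences, one conventional choice when each subject is measured under both rates. It is an assumption that a paired formula fits here, since the article does not state which t formula was used, and the scores in the example are invented for illustration.

```python
import math
import statistics


def paired_t(normal_scores, compressed_scores):
    """Return (t, degrees of freedom) for paired observations.

    Each position in the two lists must hold the same subject's score
    under the normal and the time-compressed condition.
    """
    if len(normal_scores) != len(compressed_scores):
        raise ValueError("score lists must be the same length")
    diffs = [a - b for a, b in zip(normal_scores, compressed_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical scores for four subjects, not data from the study.
    normal = [30, 25, 20, 28]
    compressed = [22, 18, 15, 19]
    print(paired_t(normal, compressed))
```

The resulting t is compared with the tabled two-tail critical value for n - 1 degrees of freedom at the chosen level (.05 or .01 in the article).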
Table 2. Mean Comprehension Scores and Significance of Differences when Lecture is Presented First Versus Second

[Table values not recoverable.]

A = Lecture A
B = Lecture B
C = Time-Comp Rate
N = Normal Rate
1 = Presented First
2 = Presented Second

*Not significant at .05 level for two-tail test


[Table 3: caption and values not recoverable; the key and significance note are the same as for Table 2.]


Table 4. Mean Comprehension Scores and Significance of Differences when Normal Rate Versus Time-Compressed Rate is Presented

[Table values not recoverable. Columns include Group and Variable.]

A = Lecture A
B = Lecture B
C = Time-Comp Rate
N = Normal Rate
1 = Presented First
2 = Presented Second

*Significant at .01 level for two-tail test



Table 5. Mean Comprehension Scores for Normal and Time-Compressed Rates and Significance of Differences for All Groups

[Means and standard deviations not recoverable. Columns: Normal Rate Mean, Normal Rate S.D., Time-Comp Rate Mean, Time-Comp Rate S.D., and t. Rows: All Subjects (N = 200) and the component groups, including IV (N = 50). Surviving t-values, row assignment not recoverable: 4.24*, 2.45**, 3.05*, 4.16*.]

*Significant at .01 level for a two-tail test

**Significant at .05 level for a two-tail test


Table 6. Degree of Separation between Normal and Time-Compressed Quartiles

Rate          P25      P50      P75
Normal        13.95    23.50    34.59
Time-Comp.     8.78    14.63    23.88


Table 7. Percentiles for the Mean Comprehension Scores of All Subjects and the Four Component Groups, Showing Percentile
Difference between the Normal and Time-Compressed Conditions

[Table values not recoverable. Columns: Normal Rate Mean, Percentile*, Time-Comp Rate Mean, Percentile*, and Difference.]

*Normal rate scores used as reference normative base groups


Percentiles


The use of percentiles afforded a clear view of the magni- 
tude of the separation between the two mean scores. Percen- 
tiles computed for the normal rate scores were used as the 
reference normative base group. Table 6 indicates that there 
was a difference between the two conditions that was almost 

equal to the semi-interquartile range. Thus, under time- 
compressed conditions, only about a quarter of the students 
did as well as the average student at the normal rate. 
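The comparison drawn from Table 6 can be reproduced with a few lines of arithmetic. The sketch below uses the quartile values printed in Table 6; the variable and function names are illustrative only.

```python
# Quartiles as printed in Table 6.
NORMAL = {"P25": 13.95, "P50": 23.50, "P75": 34.59}
TIME_COMP = {"P25": 8.78, "P50": 14.63, "P75": 23.88}


def semi_interquartile_range(quartiles):
    """Half the distance between the 75th and 25th percentiles."""
    return (quartiles["P75"] - quartiles["P25"]) / 2


# Gap between the two medians, to set beside the semi-interquartile range.
median_gap = NORMAL["P50"] - TIME_COMP["P50"]

# How far the time-compressed 75th percentile sits above the normal median.
overlap = TIME_COMP["P75"] - NORMAL["P50"]

if __name__ == "__main__":
    print(round(semi_interquartile_range(NORMAL), 2))
    print(round(median_gap, 2))
    print(round(overlap, 2))
```

The time-compressed 75th percentile lies just above the normal rate median, which is the basis for the statement that only about a quarter of the students did as well under compression as the average student at the normal rate.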

Table 7 indicates that the normal rate mean comprehen- 
sion score for the total number of subjects was 24.89. 
Scores were positively skewed at the normal rate, so that 
this score was at the 55th percentile. The time-compressed 
mean comprehension score was 16.66. Scores were nega- 
tively skewed. This latter score fell at the 33rd percentile of 
the normal rate percentiles. Similar patterns were shown for 
all component groups. 

The foregoing divergent scores should be kept in mind
when reviewing the degrees of comprehension established
by previous studies. Fairbanks, Guttman, and Miron re-
corded 90 percent as great comprehension for two passages
that ran a little over five minutes each at a rate of 282 wpm
(5:18). Orr and Friedman reported, “The main result of
the experiment indicated that naive (untrained) subjects
suffer relatively little loss (0-20%) in listening comprehen-
sion at speeds up to twice normal speaking rates (up to 325
wpm)” (16:ii). Foulke et al. reported no significant loss in
comprehension of a scientific selection presented to 291
blind sixth, seventh, and eighth grade pupils at 275 wpm
(7). The selection ran for seven and one-half minutes and
was at the fifth to sixth grade level of readability. Mc-
Cracken found no difference in comprehension when she
presented four short selections at 160 wpm and another
four short selections at 320 wpm from the Diagnostic Read-
ing Tests, Section II (14). The entire session, including test-
ing, lasted approximately 30 minutes. It is apparent that
when a realistic length of listening material was presented to
sighted college students, considerably less was compre-
hended at the time-compressed rate than at the normal rate.
The length of uninterrupted listening time appears to be a
critical factor in its influence upon time-compressed
listening comprehension.

In light of the data given in this study, it would appear
necessary to examine the efficiency index presented by
Fairbanks (5) and used by Foulke et al. (7), Jester and
Travers (12), and Sticht (18). Foulke and Sticht have ex-
pressed Fairbanks’ efficiency index as a ratio between the
comprehension score and the time used for presentation.
They “found that learning efficiency increased as word rate
was increased until a word rate of approximately 280 wpm
was reached” (9:12). Foulke et al. concluded their study
with the statement, “It was felt that those losses in compre-
hension that were statistically significant were not all educa-
tionally important, especially when the time saved in pre-
senting the material was considered” (7:141). In view of the
large loss in comprehension found in this study, it is impor-
tant to point out that the efficiency index appears to rest on
an underlying assumption that all items are equally easy or
difficult to learn, and that all items are equally important or
unimportant to their educational objectives. Nor does it take
into consideration the density of the ideas presented. In ef-
fect, as the efficiency index now operates, it tends to put a
stamp of approval on those students who learned less,
simply because they spent less time doing so.
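The objection can be illustrated with the figures reported in this study. The sketch below applies the ratio form of the efficiency index (comprehension score divided by presentation time) to the two total-sample means and the 60- and 40-minute lecture durations; the function and variable names are illustrative, and the calculation is offered as an example rather than as one the author reports.

```python
def efficiency_index(comprehension_score, minutes):
    """Comprehension score per minute of presentation time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return comprehension_score / minutes


# Total-sample means and lecture durations as reported in the article.
normal_index = efficiency_index(24.89, 60)
compressed_index = efficiency_index(16.66, 40)

# Share of the normal rate score lost under time compression.
proportion_lost = (24.89 - 16.66) / 24.89

if __name__ == "__main__":
    print(round(normal_index, 4))
    print(round(compressed_index, 4))
    print(round(proportion_lost, 3))
```

On these figures the index is very slightly higher for the time-compressed lecture even though about one-third of the comprehension score is lost, which shows how a ratio of this kind can favor a condition in which less was learned.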


Conclusions 


This study has assessed the comprehension by 200 Brook-
lyn College students of a one-hour lecture at 175 wpm as
compared with their comprehension of an equated time-
compressed lecture at 275 wpm. The one-hour lecture
period was chosen because it was considered to be more
realistically representative of the college students’ listening
experiences than the considerably shorter periods that have
been used by researchers to date. A t-test established that the
difference between these two comprehension scores was
statistically significant. Further t-tests for three criteria em-
ployed in this study (order of presentation, form of lecture
used, and rate used) proved rate alone to be statistically sig-
nificant. Percentiles constructed using the normal rate com-
prehension scores as a reference normative base group
showed that normal rate scores were positively skewed and
that time-compressed rate scores were negatively skewed.
This large drop in comprehension differs from findings re-
ported by previous researchers.


The following conclusions are drawn:

1. The length of the stimulus materials used appears to
be a critical factor. Time-compressed materials suffered a
proportionately larger loss of comprehension than did the
normal rate materials when an educationally realistic
length of materials was used.

2. The efficiency index suggested by Fairbanks and used
by subsequent researchers has been questioned operationally
because it fails to take the following factors into considera-
tion: the density of the ideas present in the listening mate-
rials; the number of items not learned; the importance of
the items learned and not learned; the relative difficulty of
the items learned and not learned; a criterion of acceptable
comprehension stated in advance. It is suggested that an ef-
fective evaluation of an educational medium must take
these factors into consideration.

3. The introduction of the efficiency index to justify the
use of time-compressed speech as an educational medium
has been further questioned because it puts a premium on
learning less, provided that less time has been spent in doing
so; it shifts the educational emphasis from “how to educate
more adequately” to “how to educate more groups”; and
it encourages skimming rather than probing and cre[...].

4. It is suggested that time-compressed speech might be
of educational value when used to develop controlled,
graduated rates of spoken practice materials for developing
speed in stenography and stenotyping.

5. Time-compressed speech may also serve a useful pur-
pose in vigilance tasks which demand a choice among a
limited number of alternatives, thus making the task one of
discrimination or intelligibility rather than comprehension.

6. Further studies that would concentrate on the estab-
lishment of what might be considered to be an appropriate
length of stimulus materials for a realistic testing situation
at different educational levels would be valuable.

7. Further studies that would contribute toward the
establishment of criteria for the measurement of idea den-
sity within test materials would help to make the interpreta-
tion of comprehension scores more meaningful.


FOOTNOTES 


1. John B. Carroll (1:60) has suggested that syllables per
minute is a better measure of rate. As noted under Materials, the
number of syllables per 100 words was one of the criteria used in
the equating of the lectures used for this study.

2. The author is indebted to David B. Orr and Herbert L.
Friedman of the American Institutes for Research, Silver Spring,
Maryland, who granted the use of their materials.
ABSTRACT


The contribution of noninstructional activities to classroom teacher effectiveness was investigated by
administering a questionnaire to 52 university faculty members. The criterion for classroom effectiveness
was a student evaluation of each teacher. A simple correlation analysis and the multiple regression
technique were used to evaluate the results of this study. A significant result of the study was the failure
of noninstructional activities [...] predictive value when student-perceived teacher effectiveness is used
as the criterion. A second significant result was [...] that time spent in consulting had upon classroom
teacher effectiveness. Implications for further research are discussed, including the suggestion that a
reevaluation of the responsibility of the university to the community, and of education to society, may
be necessary.


American [...] and education are becoming increasingly
dependent on colleges and universities for current
information and services, [...] personnel. That dependence
has created pressure on faculty to perform duties such as
consulting, research, and civic and academic committee
work. College and university administrations encourage
additional noninstructional activities, such as student
counseling, professional affiliation, and publishing, for
their academic value. Educational administration gives the
following reasons for its encouragement, citing that
noninstructional activities:

- keep a professor abreast of his field;
- result in the introduction of highly relevant material
  into classrooms;
- stimulate a professor's desire to teach; and
- introduce the process of systematic inquiry into the
  classroom (7).

Classroom teachers do not always share administration’s
[...] following [...]:

- [...] preparation [...] of materials [...];
- leave professors unavailable for conferences regarding
  matters pertaining to course work; and
- make only limited contributions to the quality of
  teaching (7).

Each of the arguments against noninstructional activities
is based on the assumption that instructional activities con-
stitute the primary function of professors. It is that assump-
tion which leads professors to question the contribution of
noninstructional activities to college classroom teacher
effectiveness.

The issue of what individual qualities most enhance
teacher effectiveness remains largely unresolved. But
attempts by education to identify influential qualities con-
tinue. Today the influence of administration on faculty to
enlarge the instructional base by involvement in non-
instructional activities is very real. In spite of administration
insistence and faculty resistance in the matter, little, if any,
evidence exists to prove or disprove the rationale for either
case.

Verdicts of some of the most comprehensive studies
made to date are inconclusive. Prior to the statement of the
Committee on the Criteria of Teacher Effectiveness, Orville
Brim (2) concluded that there were no consistent relations
between teacher characteristics and effectiveness in teach-
ing. In 1963, the Handbook of Research on Teaching (5)
reported that "teaching methods do not seem to make
much difference" and that "there is hardly any evidence to 
favor one method over another." Furthermore, it reported, 
“until very recently, the approach to the analysis of teacher- 
pupil and pupil-pupil interaction . . . has tended to be 
unrewarding and sterile.” After examining the data as well 
as the conclusions of nearly one hundred studies, Dubin 
and Taveggia (4) concluded that college teaching methods 
make no difference in student achievement as measured by 
final examinations on course content. 


Little encouragement is offered by the massive report of 
Equality of Educational Opportunity (3). According to 
that report, when the social background and attitudes of 
individual students and their schoolmates are held constant, 
achievement is only slightly related to school characteristics.


Another very comprehensive study conducted by 
Stephens (9) supported Coleman’s findings. Documenting 
his position with the educational variables of school attend- 
ance, instructional television, independent study, corres- 
pondence study, class size, individual consultation and 
tutoring, counseling, student concentration, student involve- 
ment, amount of time spent in study, job distraction, extra- 
curricular activities, school size, supervisor-rated teacher 
quality, nongraded school, team teaching, ability grouping, 
progressivism vs. traditionalism, discussion vs. lecture, 
directive vs. nondirective teaching, variable testing, and 

programmed instruction, Stephens concluded that practi- 
cally nothing seems to make a difference in the effective- 
ness of instruction. 


Fortunately, subsequent studies give reason to question 
the pessimism. According to Bowles and Levin (1), the 
research design used in the Coleman study was “over- 
whelmingly biased in a direction that would dampen the 
importance of school characteristics.” For example, expen- 
diture was measured within an entire school district rather 
than within the given school in which the pupils were 
located. Hence, the expenditure-per-pupil was overstated 
for schools attended by lower-class students and under- 
stated for schools attended by students of higher social 
status. Faulty statistical models were also used, according 
to Bowles and Levin (1). The importance of a variable was 
measured by how much the proportion of variance in 
achievement explained was increased by adding that variable 
to the predictors. This procedure was followed without 
reference to the order in which the variables were added 
into the regression equation, and the order of the variables 
was such as to favor or overweight the family background 
characteristics. Despite the discrepancies in design, exclusive 
of family background characteristics, teacher characteristics 

accounted for higher variation than all other aspects of the 
school combined (3). In final reference to the Coleman 
study, Mood (8) has stated that “the present rudimentary 
state of our quantitative models does not permit us to 
disentangle the effects of home, school, and peers on stu- 
dent’s achievement.” 
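The order-of-entry problem described by Bowles and Levin can be shown with a two-predictor example. The sketch below uses the standard formula for R squared with two correlated predictors; the correlation values are hypothetical and chosen only to illustrate the effect, and the function names are not from any of the studies cited.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R squared for a criterion regressed on two predictors.

    r_y1 and r_y2 are the correlations of each predictor with the
    criterion; r_12 is the correlation between the two predictors.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def incremental_r2(r_y_new, r_y_old, r_12):
    """Variance explained by the new predictor after the old one is entered."""
    return r_squared_two_predictors(r_y_old, r_y_new, r_12) - r_y_old ** 2


if __name__ == "__main__":
    # Hypothetical correlations with achievement: background .6, school .5,
    # and a correlation of .7 between background and school measures.
    background, school, overlap = 0.6, 0.5, 0.7
    print("school entered first: ", round(school ** 2, 4))
    print("school entered second:", round(incremental_r2(school, background, overlap), 4))
```

With correlated predictors, the variable entered second is credited only with what the first has not already absorbed, so a school measure can appear nearly irrelevant when family background is entered ahead of it.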


Two suggestions offer encouragement and direction amid the
pessimism that surrounds research in teacher effectiveness.
Mood (8) advises that one way to improve research is to
obtain better measures of a larger number of the teacher
attributes that are significant to the ability of teachers to
improve learning. He says that such measures will come
closer to estimating the full effect of teachers, independent
of home and school factors. The second suggestion is to
aim these measures at process variables, “those human
actions which transform the raw materials of input into
opportunities for learning” (Gagne, 6), i.e., teacher
activities rather than teacher characteristics. It is to the
contribution of those teacher activities that this study is directed.


Method and Procedure 


Following an administrative push for an increase in the 
quantity and quality of research at Appalachian State Uni- 
versity, 1500 students and 52 faculty members participated 
in a study of the relevance of administrative policy. The 
purpose of this study was to determine if a relationship 
existed between college classroom teaching effectiveness 
and noninstructional activity. This resulted in an attempt to 
correlate student-perceived teacher effectiveness with self- 
reported noninstructional activity. 

In order to assess noninstructional activity, an instrument 
was developed to measure the degree of faculty involvement 
in such noninstructional activities as research, professional 
affiliation, committee work, articles published, conferences 
and workshops sponsored or attended, books published, and
professional consulting (Appendix A). After the faculty
members had completed a noninstructional activity inventory,
their students completed an inventory of teacher character- 
istics (Appendix B). That instrument was designed to elicit 
information on teacher effectiveness, the criterion variable. 

The two instruments provided a total of 11 variables for
each faculty member, from which followed a correlation
analysis. This was intended to yield intercorrelations between
noninstructional activity and teacher effectiveness. Finally,
an attempt was made to analyze sets of predictor variables
(noninstructional activities) in terms of their contribution
to the prediction of the criterion, college classroom teacher
effectiveness.

Ten noninstructional activity variables were used, as
reported by the 52 sample faculty members. The variables
are as follows:


1- (ACS) - The number of academic committees
served

2- (MPO) - The number of memberships in
professional organizations

3- (AWC) - The number of academic workshops
and conferences attended

4- (AAP) - The number of academic articles
published

5- (ABP) - The number of academic books
published

6- (PCC) - The number of professional consulting
contracts

7- (HrACS) - The average hours per week spent in
academic committee meetings or in
preparation for academic committee
meetings

8- (HrPP) - The average hours per week spent in
preparation of academic materials
intended for publication

9- (HrWC) - The average hours per week spent in
academic workshops and conferences
or in preparation for academic
workshops and conferences

10- (HrPCC) - The average hours per week spent in
performance of professional consulting
contracts

Variable 11 (TchEff) rated the general classroom teacher
effectiveness, without reference to specific characteristics.

No coding, rating, grading or scaling was done on the
first ten variables, since the reported data represented 100
percent of the data. An examination of Table 1 reveals the
tabulations of means and standard deviations for all of the
variables.

The average score was computed for student-perceived
teacher effectiveness on each sample faculty member. The
average was computed by dividing the cumulative scores
on variable 11 by the total number of responses to that
variable. For example, if five respondents scored a sample
faculty member 3, 5, 7, 6, and 9 on that variable, his
cumulative score on that variable would be 30. When his
cumulative score was divided by the total number of
responses, 30/5, the result was an average score of 6. On a
scale of 1 to 9, high scores represented effectiveness in
reference to that variable.


Table 1.—Means and Standard Deviations for All Variables

     Variable     Mean     Standard Deviation
 1   ACS          1.90     1.29
 2   MPO          2.79     1.36
 3   AWC          2.56     2.66
 4   AAP          1.10     2.11
 5   ABP          0.04     0.19
 6   PCC          0.81     1.37
 7   HrACS        1.49     1.13
 8   HrPP         4.40     5.81
 9   HrWC         1.32     2.86
10   HrPCC        1.68     5.25
11   TchEff       7.29     0.96


This study was designed to explore the contribution of
noninstructional activity to college classroom teacher
effectiveness. Although not specifically hypothesized, the
issue of intercorrelations between noninstructional activity
and classroom teacher characteristics was presented. The
issue of intercorrelations among different noninstructional
activities was also presented. Table 2 illustrates the
intercorrelations of those data. The intercorrelation coefficients
among the eleven variables are product-moment coefficients.

The most notable feature of the full correlation analysis
was the lack of correlation. Variable 11 (overall teacher
effectiveness) was used as the criterion variable, the measure
of teacher effectiveness. Variable 10 measured the
hours per week spent in performance of professional
consulting contracts. As can be seen in Table 2, only variable
10 offered any appreciable correlation with teacher
effectiveness. The significance of this will be discussed in the
regression analysis.

The first ten of the eleven variables were items included
in the noninstructional activity instrument that was completed
by the 52 sample faculty members. Items 7, 8, 9,
and 10 duplicated the subject matter of items 1, 2, 3, 4, 5,
and 6. The purpose of this was to measure the data as done
by university administrations, then to measure the same
data in units deemed more appropriate by the writers.
Without reference to the credibility of either unit of measure,
Table 2 reveals low correlations between similar data
measured in different units. Variables 1 and 7 (committee
work), variables 3 and 9 (workshops and conferences),
variables 4 and 8 (publications), variables 5 and 8
(publications), and variables 6 and 10 (consulting contracts)
illustrate the low correlations, despite the similarity of data
tested.

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Table 2.—Intercorrelations 


              1    2    3    4    5    6    7    8    9   10   11
ACS      1   00   13   27   05   09   25   55  -12   32  -05   09
MPO      2   13   00   41  -01  -04   09   00  -09   05  -04   28
AWC      3   27   41   00   10   03   46   16  -11   27  -06   26
AAP      4   05  -01   10   00  -06   11   00   46   00  -05   03
ABP      5   09  -04   03  -06   00   32   09   12   08  -02   06
PCC      6   25   09   46   11   32   00   36   03   19   06   16
HrACS    7   55   00   16   00   09   36   00   11   41   02  -11
HrPP     8  -12  -09  -11   46   12   03   11   00  -11  -11  -05
HrWC     9   32   05   27   00   08   19   41  -11   00   03   12
HrPCC   10  -05  -04  -06  -05  -02   06   02  -11   03   00  -32
TchEff  11   09   28   26   03   06   16  -11  -05   12  -32   00
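
As a reading aid for the criterion row of Table 2, the following minimal
Python sketch ranks the ten predictors by the size of their correlation
with TchEff. It assumes the table entries are product-moment coefficients
printed with the decimal point omitted (so that -32 stands for -0.32);
that reading, and the names `criterion_r` and `ranked`, are assumptions
made for illustration and are not part of the original analysis.

```python
# Correlations of each predictor with TchEff (row 11 of Table 2),
# assuming the printed entries omit the leading decimal point.
criterion_r = {
    "ACS": 0.09, "MPO": 0.28, "AWC": 0.26, "AAP": 0.03, "ABP": 0.06,
    "PCC": 0.16, "HrACS": -0.11, "HrPP": -0.05, "HrWC": 0.12, "HrPCC": -0.32,
}

# Rank predictors by absolute correlation with the criterion.
ranked = sorted(criterion_r.items(), key=lambda kv: abs(kv[1]), reverse=True)
strongest_name, strongest_r = ranked[0]

for name, r in ranked:
    # r squared is the share of criterion variance a single predictor shares.
    print(f"{name:6s} r = {r:+.2f}   r^2 = {r * r:.3f}")
```

Under that reading the largest coefficient in absolute value belongs to
HrPCC, the only negative relationship of any size, which agrees with the
text's remark about variable 10.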


In addition to the simple correlation analysis, the
investigator used multiple linear regression to determine the
unique contribution of proper subsets of the predictor variables,
1-10, to the prediction of the criterion, classroom teacher
effectiveness. The contribution of a set of variables to
prediction was measured by the difference between two
squares of multiple correlation coefficients (RSs), one
obtained for a regression model in which all predictors are
used, called the full model (FM), and the other obtained
for a regression equation in which the proper subset of
variables under consideration has been deleted, called the
restricted model (RM). The difference between the two
RSs was tested for statistical significance with the variance
ratio test.
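
The comparison of a full and a restricted model can be sketched in a few
lines of Python. The article does not print its formula, so the standard
variance ratio for a drop in R squared is assumed here; the function name
`r2_drop_f` is hypothetical. The inputs are taken from the appendix summary
for numerically reported administrative involvement (RS (FM) = 0.2663,
RS (RM) = 0.2179, df = 1 and 41), for which the article reports F = 2.70.

```python
def r2_drop_f(rs_full, rs_restricted, df_num, df_den):
    """Variance ratio for the drop in squared multiple correlation.

    rs_full       -- RS of the full model (all predictors)
    rs_restricted -- RS of the restricted model (subset deleted)
    df_num        -- number of predictors deleted
    df_den        -- N minus number of full-model predictors minus 1
    """
    explained_per_df = (rs_full - rs_restricted) / df_num
    unexplained_per_df = (1.0 - rs_full) / df_den
    return explained_per_df / unexplained_per_df


# 52 faculty members and 10 predictors give df_den = 52 - 10 - 1 = 41.
f_admin = r2_drop_f(0.2663, 0.2179, df_num=1, df_den=41)
print(round(f_admin, 2))
```

With these inputs the assumed formula gives a ratio of about 2.70, matching
the reported value, which suggests (but does not prove) that this is the
computation the investigators used.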

The unique contribution of a variable to the prediction 
of a criterion may be interpreted in several ways, one of 
which seemed most reasonable to the writers. If a variable 
was making a unique contribution, then knowledge of that
variable furnished information about the criterion. If a 
variable was making a unique contribution, then the respon- 
dents to the questionnaire, who were unlike on the variable 
but who were exactly alike or were matched on the other 
predictors, would differ on the criterion. 

It was desirable to group predictors into logical sets,
subsets, sub-subsets, and so forth down to the individual
variables. The hierarchical grouping enabled the investigator
to eliminate unnecessary tests. Subjective analysis of the
variables in this study suggested that they formed a
hierarchical pattern as shown in Table 3.

A schematic was made to guide the sequence of tests;
Table 4 illustrates the schematic. The topmost block,
number 1, indicates that variables (1-10) were to be used
as predictors in the FM. The next two sets, blocks 2 and
13, represent the numerically reported noninstructional
activity and the hourly reported noninstructional activity.

Table 3.—Hierarchy of Variables


Numerically reported noninstructional activity (1, 2, 3, 4, 5, 6)

    Nonremunerating involvement (1, 2, 3)
        Administrative (1)
        Professional (2, 3)
            Membership (2)
            Participation (3)

    Remunerating involvement (4, 5, 6)
        Publishing (4, 5)
            Articles (4)
            Books (5)
        Consulting (6)


Hourly reported noninstructional activity (7, 8, 9, 10)

    Nonremunerating involvement (7, 9)
        Administrative (7)
        Professional (9)

    Remunerating involvement (8, 10)
        Publishing (8)
        Consulting (10)


Table 4.—Schematic for Regression Models
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ported 
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the dotted line following Model 2 illustrates the conclusion 
of the analysis of variables (1, 2, 3, 4, 5, 6). 


Hypothesis 3 stated that there is no significant gain 
or loss in predictive efficiency when the hourly reported 
noninstructional activities are deleted from the full model. 
(See Table 7.) The restricted model, FM-(7-10), which
consisted of a model from which the hourly reported non- 
instructional activity had been deleted, had an RS equal 
to 0.11. The drop, 0.26 - 0.11 = 0.15, was statistically sig- 
nificant, indicating such activity was making a unique con- 
tribution to the prediction of the criterion. Further analysis 
seemed in order. 


As shown in Table 8, the RS of 0.21 for the next subset 
FM-(7, 9), which consisted of deleting the committee work 
variable and the conferences and workshops variable from 
the FM, indicated that these variables were making no con- 
tribution which could not be explained by the other eight 
variables of the FM. Thus, at this stage, it was concluded 
that further analysis of variables (7, 9) was unnecessary.

In Table 4, the dotted line following Model 14 shows that 
testing of further subsets of this set was terminated at this 
point. Whatever relationship existed between the criterion 
and these two variables could be explained by other pre- 
dictors. The investigator proceeded to test the collateral 
model, FM-(8, 10), Model 15, Table 9. The RS of 0.16 
suggested that the variables publications and consulting
contracts should be analyzed further. The model for the
data had an RS of 0.16, which was significantly less than 
the 0.26 for the full model. It was concluded that further 
analysis of variables (8, 10) was in order. 

Finally, when the variables (8, 10), publications and 
consulting contracts, were examined as shown in Table 10, 
the restricted model, Model 18, FM-(8), had an RS of 0.26. 
This amounted to no difference from the RS of 0.26 for 
the full model. It was concluded that publications made
no contribution to the prediction of the criterion. 


As shown in Table 11, Model 19, FM-(10), consulting
contracts, had an RS of 0.17, which was significantly less
than that of the FM. Thus, it was concluded that of all the
hourly reported noninstructional activity variables, only the
average hours per week spent in performance of professional
consulting contracts made a unique contribution to
the prediction of the criterion, college classroom teacher
effectiveness.



Conclusions

Therefore, the following conclusions were drawn in
regard to the three specific hypotheses of the study:

Hypothesis 1, that there is no difference between the 
predictive efficiency of the full model and that of the zero 
model, was rejected. It seemed reasonable to conclude that 
whether a teacher will be effective can be predicted with a 
moderate degree of accuracy from knowledge of the non- 
instructional activity in which he participates. 


Hypothesis 2, that there is no significant gain or loss in 
predictive efficiency when the numerically reported non- 
instructional activities are deleted from the full model, was 
accepted. It seemed reasonable to conclude that whether a 
teacher will be effective cannot be predicted with accuracy 
from knowledge of the numerically reported noninstruc- 
tional activity in which he participates. 

Hypothesis 3, that there is no significant gain or loss in 
predictive efficiency when the hourly reported non- 
instructional activities are deleted from the full model, was 
rejected. It seemed reasonable to conclude that whether 
a teacher will be effective can be predicted with a moderate 
degree of accuracy from knowledge of the hourly reported 
noninstructional activity in which he participates. Specifically,
it was variable 10, hours per week spent in performance
of consulting contracts, which provided a unique
contribution to prediction: increased involvement in consulting
contracts had the effect of decreasing college classroom
teacher effectiveness.

The question of whether significant intercorrelations 
exist between noninstructional activities and college class- 
room teacher effectiveness was answered in the negative. 
Only the negative influence of consulting deviated from that 
general conclusion. The predictor variables were selected 
because they were most often recommended by college 
administrators and because they represented reasonably 
accessible types of information which might have some pre- 
dictive relationship to teacher effectiveness. But the most 
notable feature of the data was their failure to correlate. 
The noninstructional activities had neither positive nor 
negative effects on teacher effectiveness. Even when the 
data were subjected to regression analysis in sets, the 
relationship between the noninstructional activities and 
teacher effectiveness was insignificant. As evidenced by the 
failure of data to correlate, and the failure of significant 
differences to appear between tested subsets, noninstructional
activity made no contribution to classroom teacher
effectiveness.


It would appear that the traditional predicting and evalu- 
ating standards of college classroom teacher effectiveness 
bear little or no relation to the subject. While a professor's 
noninstructional involvement may be an indicator of professional
dedication, it is not an indicator of classroom effectiveness.
If classroom instruction is to remain the primary
function of professors, the traditional noninstructional
activity criterion of teacher effectiveness must be discontinued,
particularly for purposes of hiring and promoting
of faculty.


The implications of these findings seem to offer two 
courses of action to college administrators. The first is for 
administration to search for new ways to help faculty
increase classroom teacher effectiveness. The second is for 
administration to seek additional evidence in support of 
its claim of increased teacher effectiveness through involvement
in noninstructional activity. These findings in no way
suggest that noninstructional activities are without redeeming
value. But a reevaluation of the responsibility of the
university to the community, and of education to society,
may provide the necessary justification of noninstructional
activities in general.


Table 5.—Regression Summary for Full Model Compared to the
Zero Model

RS (FM) = 0.2663    df (num) = 10    F = 1.28
RS (RM) = 0.0000    df (den) = 41    PR = 0.1785


Table 6.—Regression Summary for Full Model Compared to
Numerically Reported Noninstructional Activity

RS (FM) = 0.2663    df (num) = 5
RS (RM) = 0.1518    df (den) =


Table 7.—Regression Summary for Full Model Compared to Hourly
Reported Noninstructional Activity

RS (FM) = 0.2663    df (num) = 4     F = 2.6


Table 8.—Regression Summary for Full Model Compared to Hourly
Reported Nonremunerating Involvement

RS (FM) = 0.2663    df (num) = 2     F = 1.51
                    df (den) = 41    PR = 0.2321


RS (FM) = 0.2663    df (den) = 41
RS (RM) = 0.1694


RS (FM) = 0.2663    df (den) =
RS (RM) = 0.1705


Appendix A

ONLY YOU HAVE THIS INFORMATION, SO PLEASE HELP!

You were randomly selected as one of sixty Appalachian
State University faculty members to participate in a teacher
effectiveness study. This research represents an attempt to
measure the contribution of non-teaching activities to overall
classroom teacher effectiveness.

The purpose of this instrument is to determine the total
amount of non-teaching activity in areas specified, in which
ASU faculty members participated, during the period of
December 1, 1971, to November 30, 1972. It is not necessary
to itemize the activities.

____ number of academic committees served
____ number of memberships in professional organizations
____ number of academic workshops and conferences
     attended
____ number of academic articles published
____ number of academic books published
____ number of professional consulting contracts
____ average hours per week spent in academic committee
     meetings or in preparation for academic committee
     meetings
____ average hours per week spent in preparation of
     academic materials intended for publication
____ average hours per week spent in academic workshops
     and conferences or in preparation for academic
     workshops and conferences
____ average hours per week spent in performance of
     professional consulting contracts

If this instrument is to serve the purpose for which it was
intended, it must be followed by a student evaluation of you.
Please indicate a class, hour, and room from which a sample
evaluation may be drawn, prior to the conclusion of the
current quarter.

class ____    hour ____    room number ____

Thank you for more help than one has a right to ask of you.

RETURN TO: Ron McCullagh
           Business Administration
           Campus


Appendix B

University of Northern Colorado
Bureau of Research
Professional Inventory

"Compared to other courses and other teachers, I would
rate this course or this instructor—"

                                             Low    Av.    High
 1. Course objectives are stated,            1 2 3 4 5 6 7 8 9
    followed, attained.
 2. Assigned work is appropriate in          1 2 3 4 5 6 7 8 9
    amount and level.
 3. The materials used (text, films,         1 2 3 4 5 6 7 8 9
    handouts, etc.) I would rate
 4. Everything considered, I would           1 2 3 4 5 6 7 8 9
    rate the worth of this course
    to me
 5. The instructor's genuine interest        1 2 3 4 5 6 7 8 9
    in students
 6. The instructor's communication           1 2 3 4 5 6 7 8 9
    skills—lecturing, questioning,
    answering, discussing
 7. The instructor's professional            1 2 3 4 5 6 7 8 9
    qualities—thorough knowledge
    of the subject
 8. The instructor's professional            1 2 3 4 5 6 7 8 9
    qualities—preparation for each
    class
 9. The instructor makes difficult           1 2 3 4 5 6 7 8 9
    topics easy to understand
10. The instructor identifies what           1 2 3 4 5 6 7 8 9
    he considers important.
11. The instructor's personal                1 2 3 4 5 6 7 8 9
    characteristics—mannerisms
    and dress
12. The instructor's personal                1 2 3 4 5 6 7 8 9
    characteristics—is dynamic
    and energetic, enjoys teaching
13. The instructor's interpersonal           1 2 3 4 5 6 7 8 9
    relationships with students—
    fair, approachable, honest.
14. Ability to demonstrate skills            1 2 3 4 5 6 7 8 9
    and techniques
15. The instructor's interpersonal           1 2 3 4 5 6 7 8 9
    relationships with students—
    has a sense of humor
16. The instructor's availability            1 2 3 4 5 6 7 8 9
    and promptness—conferences and
    office hours
17. The instructor encourages and            1 2 3 4 5 6 7 8 9
    provides time for questions
    and discussion.
18. Everything considered, this              1 2 3 4 5 6 7 8 9
    instructor rates
19. (Special item may be used by             1 2 3 4 5 6 7 8 9
    instructor.)
20. (Special item may be used by             1 2 3 4 5 6 7 8 9
    instructor.)
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Appendix C 


Regression Summary For Full Model Compared To Numerically
Reported Nonremunerating Involvement

RS (FM) =           df (num) =
RS (RM) =           df (den) =


Regression Summary For Full Model Compared To Numerically
Reported Remunerating Involvement

                    df (num) =
                    df (den) =


Regression Summary For Full Model Compared To Numerically
Reported Time Spent in Administrative Involvement

RS (FM) = 0.2663    df (num) = 4
RS (RM) = 0.2608    df (den) =


Regression Summary For Full Model Compared To Numerically
Reported Time Spent in Professional Involvement

RS (FM) = 0.2663    df (num) = 2     F = 1.45
RS (RM) = 0.2146    df (den) =       PR = 0.2462


Regression Summary For Full Model Compared To Numerically
Reported Publishing

RS (FM)             df (num) =
RS (RM) =           df (den)


Regression Summary For Full Model Compared To Numerically
Reported Consulting

                    df (num) =
                    df (den) =


Regression Summary For Full Model Compared To Numerically
Reported Time Spent in Professional Managership

                    df (num) = 1     F = 2.16
                    df (den) = 41    PR = 0.1459


Regression Summary For Full Model Compared To Numerically
Reported Published Articles

RS (FM) = 0.2663    df (num) = 1     F = 0.00
RS (RM) = 0.2663    df (den) = 41    PR = 0.9614


Regression Summary For Full Model Compared To Numerically
Reported Published Books

RS (FM) = 0.2663    df (num) = 1     F = 0.00
RS (RM) = 0.2663    df (den) = 41    PR = 0.9583


Regression Summary For Full Model Compared To Numerically
Reported Administrative Involvement

RS (FM) = 0.2663    df (num) = 1     F = 2.70
RS (RM) = 0.2179    df (den) = 41    PR = 0.1040


Regression Summary For Full Model Compared To Numerically
Reported Professional Involvement

RS (FM) = 0.2663                     F = 1.15
                                     PR = 0.2901


Regression Summary For Full Model Compared To Numerically
Reported Professional Managership

RS (FM) = 0.2663    df (num)         F = 2.16
RS (RM) = 0.2665    df (den)
VERBAL LEARNING AND SELF- AND
SUPER-IMPOSED ORGANIZATION
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ABSTRACT 


The study was designed to test the general hypothesis that the acquisition of information is greater when the learner is
trained to use a self-imposed organizational system rather than one which has been super-imposed on him by others. The
results supported the hypothesis with a significant main effect for type of organization in favor of self-imposed training. In
addition, it was found that with regard to the acquisition of information, serial memorization is more effective than self-
imposed or super-imposed organizational systems in which no training is given; self-imposed testing situations are superior
for retention; and self-imposed learning tested in self-imposed testing situations is superior to all other combinations of
organization and testing.


THE PRIMARY PURPOSE of this research was to in- 
vestigate the short- and long-term retention effects resulting 
from training students to organize and to classify informa- 
tion to be acquired. 

A number of investigators have demonstrated that the 
retrieval of acquired information is enhanced by the or- 
ganizational system used in the process of acquisition 
(1:331-335; 3:40-48; 1). It has been demonstrated also that 
self-imposed organizational systems, i. e., ones developed 
by learners themselves and used in the process of acquiring 
information, are more productive than organizational sys- 
tems super-imposed on the learner (2:126-131). 

In the present study, it was assumed that with regard to 
the acquisition and retention of information: (1) the use of 
self-imposed organizational systems is superior to super- 
imposed systems; and (2) individuals can learn to improve 
the quality of an organization which they create. Based on 
these assumptions, the general hypothesis was tested that 
training students to organize information which they are to 
learn increases their learning productivity and efficiency. 


Method 
Subjects 

One hundred twenty eighth and ninth grade public school 
children of low-income families from a small mining com- 
munity in central Pennsylvania participated in the experi- 
ment. The IQ range of the Ss was from 90 to 106, with a 
mean IQ of 100 and a standard deviation of 3.14. 
Materials

The task which produced the data for the experiment
and which was required of all Ss was to learn a list of 25
words predetermined by E. The only materials necessary
were 3 x 5 cards with a single word printed on each card.
The words were common nouns or verbs, e.g., block, nature,
and sign, well known to the Ss and capable of being classified
into several different organizational systems. For example,
“block” could be placed in the categories of wood
product, plaything, obstruction; “sign” in categories such
as writing behavior or symbols, etc. All 25 words, then,
were such that if Ss were asked to place them into seven or
less categories, for example, several different classifications
of categories were possible. Six words, in addition to the
25 for the various treatments, were used for a practice
trial.


Design


The design included five treatment groups labeled: self-
imposed organization (SI); super-imposed organization (SUP);
serial memorization (SM); controlled self-imposed organization
(CSI); and controlled super-imposed organization
(CSUP). All Ss were given the task of learning the list of 25
words predetermined by E discussed under Materials. The
Ss in the SI group received training in the use of their own
self-imposed word organization. Those in the SUP group
received training by using a set of E’s word categories for
organization. The SM Ss received training in serial memorization.
The amount of training necessary for each S
to reach a criterion of consistent replication of a single
organization of words, whether it be self- or super-imposed,
varied. To control for the varying amount of training across
Ss, the CSI group experienced the average amount of the
same training as the SI group; and the CSUP group received
the average amount of the same training as the SUP group.

After training Ss with either a self- or super-imposed
organization, it was possible to test for the retention of the
learned words by asking Ss to classify as many of them
as they could recall into either the organization scheme
in which they were trained or a different one. Thus it was
possible to test for retention by using either E’s or
the S’s own organizational systems. A testing dimension of
two levels, self-imposed (SIT) and super-imposed (SUPT),
was included in the design. The effect of retention was
examined by including two levels in the design, short- and
long-term retention.

Ss were randomly assigned to all treatments in a 2 x 2 x 5
factorial with: two levels of retention, short and long; two
levels of test, self-imposed (SIT) and super-imposed
(SUPT); and five levels of treatment, SI, SUP, SM, CSI, and
CSUP. Originally, the experiment was carried out with 60
Ss. However, 60 additional Ss drawn from the same population
were available and the experiment was replicated.
Thus, the final analysis was completed using a replicated
design of two 2 x 2 x 5 factorials (6:391-394).

The test required Ss to recall as many of the 25 words as
was possible. The measure of short-term retention
was administered following the termination
of the experimental procedure, and the measure of long-
term retention was administered after an interval of 48
hours following the termination of the experimental
procedure.


Procedure

In one practice trial, the Ss in the five experimental
groups experienced an example of what was required by
using the practice cards in a way appropriate for their
experimental condition. Following the example, questions of
the Ss were answered with respect to the procedure [...]
the experimental condition appropriate for [...] was
implemented by using the [...].

The SI Ss [...] in no less than two and no more than seven
columns. They were also [...] organization they used on each
trial. The words were rearranged randomly after each trial
and presented again for the Ss to categorize. These Ss
continued the experimental task until they attained two
successive identical [...].

The SUP Ss were given by E five headings naming categories
into which the words were to be placed. On the first
trial E showed the Ss which words were [...] their respective
category headings. [...] was randomly chosen and placed in
its category [...] to put them into the proper categories. All
mistakes were corrected at the end of the trial. Ss had the
task of placing the words under the correct headings until
two successive successful trials were achieved.

The SM group acted as a control for the training in
organization. Each S in this group was presented the
words in serial fashion for four trials, the average number
of trials required for both the SI and the SUP groups
together.

The average number of exposures to criterion required
for the SI group was three; therefore, each S in the CSI
group was exposed to the SI procedure three times. The Ss
in the CSI group were asked to categorize the words any way
they wished by using two to seven columns.

The average number of exposures to criterion required
for the SUP group was five; therefore, each S in the CSUP
received five exposures to the words. The Ss in the CSUP
group were asked to categorize the words by using E’s
organizational scheme.


Results and Discussion

An ANOVA with method, type of test, type of retention,
and replication was completed with the number of words
correctly recalled as the dependent variable (see Table 1).
The main effect of method was significant (p < .01). A
Newman-Keuls post-test revealed that the groups using the
self-imposed method of organization (SI), super-imposed
organization (SUP), and serial memorization (SM) differed
significantly from both the self-imposed control group (CSI)
and the super-imposed control group (CSUP) (p < .01).
Inspection of the means (see Table 2) indicates that the means
of the CSI and the CSUP groups were lower than the other
groups.

What is most interesting about these findings is that the
SM group performed higher than either the CSI or CSUP
group, suggesting that the SI and SUP organization without
training may enhance recall less than SM, or that training
may contribute more to recall than the organizational
system.

As may be observed in Table 1, the main effect of tests
was also significant (p < .01). The Ss using the SIT testing
condition recalled more words [...] than the Ss using
the SUPT testing condition (16.65). This observation suggests
that a super-imposed organizational system applied to
testing differs from a self-imposed organizational system,
in that a super-imposed testing situation may interfere with
the recall of information acquired and thus produce a lower
performance as happened here.

The effects of the testing condition on retention
are apparent in the significant interaction (p < .05)
observed between testing conditions and short- and
long-term retention. A Newman-Keuls post-test indicated
a significant difference (p < .01) between the SIT and the
SUPT condition for long-term but not short-term retention.
The mean of the SIT condition (15.93) was greater than
[...]. This supports the notions that testing is a type
of learning situation and that the [...] SI condition
in testing where long-term retention is a consideration
[...] performance.

Table 1.—ANOVA of the Number of Correct Words Recalled

Source                 df      MS        F
Treatment (Tr)          4     85.32    11.29**
Test (T)                1     72.07     9.55**
Retention (R)           1    385.20    50.95**
Replication (Rep)       1      5.20    <1.00
Tr x T                  4     27.60     3.65**
Tr x R                  4     12.15     1.61
Tr x Rep.               4     14.15     1.87
T x R                   1     33.08     4.38*
T x Rep.                1     46.88     6.20
R x Rep.                1     15.42     2.04
Tr x T x R              4     11.43     1.51
Tr x T x Rep.           4      5.56    <1.00
Tr x R x Rep.           4     18.72     2.48
T x R x Rep.            1     14.00     1.85
Tr x T x R x Rep.       4      7.74     1.02
Error                  80      7.56

**p < .01
*p < .05
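
The F column of Table 1 can be checked against the mean squares, since each F is the effect mean square divided by the error mean square. The short sketch below does this for a few rows; it assumes every effect is tested against the Error row (MS = 7.56, df = 80), which the table layout suggests but the text does not state outright. The function and variable names are illustrative only.

```python
# Mean squares copied from Table 1; ERROR_MS is the Error row (df = 80).
ERROR_MS = 7.56

mean_squares = {
    "Treatment (Tr)": 85.32,
    "Retention (R)": 385.20,
    "Tr x R": 12.15,
    "Tr x T x Rep.": 5.56,
}


def f_ratio(ms_effect, ms_error=ERROR_MS):
    """Return the F ratio for an effect tested against the error term."""
    return ms_effect / ms_error


# Round to two decimals, the precision used in the table.
f_values = {source: round(f_ratio(ms), 2) for source, ms in mean_squares.items()}

for source, f in f_values.items():
    print(f"{source:<16} F = {f:.2f}")
```

The recomputed ratios for these rows match the printed F values to two decimals, and the last one falls below 1.00 as the table indicates.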


Table 2.—Means and Standard Deviations for Treatment 
———— ÁÀ——— 


Trea tment Mean m 
—— ———— d ——————— 
SI 18.33 


MOORE, HAUCK, AND FURMAN 
T 


Table 3.—Means and Standard Deviations of the Treatment by 


Test Interaction 


Treatment 


SI SUP 


SD 


Test Mean SD Mean 


20.66 2.35 


SUPT 16.00 1.93 


The ANOVA also indicates that the main effect of
retention was significant (p < .01), with the Ss in the
short-term recall groups recalling a greater number (18.22)
than the Ss in the long-term recall groups (14.65). This
observation is consistent with the literature concerning
short- and long-term recall.

Probably the most interesting finding was the significant
interaction (p < .01) of treatment and type of test. The
Newman-Keuls post-test revealed that SI Ss differed
significantly (p < .05) from all other groups under the
student-initiated testing condition (see Table 3). This
observation agrees with the findings of Tulving (5:219-237)
and Mandler and Pearlstone (2:126-131) who have
demonstrated that self-imposed organizational systems enhance
the recall of information acquired. Further, it extends
the research in this area by demonstrating that training Ss
to organize information with the development of their
own organizational system as a goal provides a more
effective retrieval system than either imposing organizational
systems or simply encouraging but not training Ss to develop
an organizational system.
The fact that the SI group differed from the other
groups only under the self-imposed testing conditions
supports the earlier suggestion that the super-imposed
testing situation may interfere with the organizational
system of the learner and thus reduce its effectiveness
as a retrieval system of acquired information. This
observation is similar to those made by Mandler and
Pearlstone (2:126-131) when they forced their Ss to
use particular methods of organizing word lists.
oup veio acre note mat 
[9 ignificantly differen of the © 
them “nder the cei testing a wn 
icant an of the SUP was the greater: however; ^3 
p ifference occurred between the two for the : 
nm testing condition. A probable explanation e 
is ing, Tation is that the super-im esting © 
*rferes with recall to a great 


ignificant 


hat of the CSI 


Mean 


17.17 3.22 18.08 4.22 


17.75 3.38 16.83 3.98 


SH CSI CSUP 


SD 


SD Mean 


16.25 3:77 15.83 4.11 


14.00 4.29 


13.67 3.63 


conditions than under SUP conditions. An inspection of 
the means (see Table 3) supports this explanation. 

The importance of a self-imposed organizational system
as a result of training is also emphasized by the fact that
the SM group differed (p < .05) under the self-imposed
testing conditions only from the SI group and the CSUP
group, with the mean of the SM group being less than the
mean of the SI group and greater than the mean of the
CSUP group.

In summary, the results of this experiment tend to support
the hypotheses that student-initiated organizational systems,
both in the organizational and testing situations, contribute
to the recall of information, and that the training of students
to develop their own organizational systems increases their
effectiveness in the long-term recall of acquired information.
Assuming the validity of these findings, it is recommended
that classroom teachers place greater emphasis on providing
the conditions necessary for their students to acquire a
concept of organization of information, especially where
long-term recall is the consideration.
ABSTRACT


A refinement of Hofmann's paradigm for conservation training is proposed. It is suggested that a delayed learning task
be utilized for assessing generalization. This, it is argued, would also be a preferable measure of whether or not new
cognitive structures have been acquired.


THE MOST DIFFICULT PROBLEM in assessing con- 
servation training is satisfying the criteria laid down by 
Piaget (2) to demonstrate that development rather than 
simple learning has occurred. As Hofmann (1) notes, this 
involves ascertaining that the change is stable over time, 
that it is dynamic in that it can lead to generalizations, 
and that more complex structures have been acquired. 

From an empirical point of view, the most difficult of 
these criteria to operationalize is that of testing for the 
acquisition of more complex structures. Within Hofmann's 
paradigm, the original pre-test is used to assess whether 
or not a cognitive structural change has occurred. However, 
as Hofmann clearly shows, it may be misleading to credit 
any improvement on this test to a change in the child's 
cognitive structures, since such improvement may simply 
be the result of an interaction between the pre-test and 
the training which has produced only rote learning. 

As a control for this, Hofmann incorporates a delayed 
post-test into his design to assess whether or not any im- 
provement on the immediate post-test is merely the re- 
sult of external reinforcement. 


Criterial Problems 

Two criteria must be satisfied for it to be suggested
that more complex structures have been acquired. Hofmann
suggests that: (1) there should be a difference between
the delayed post-tests of the trained and nontrained
groups; and (2) there should be no difference between the
immediate and delayed post-tests of the trained group.
Failure to meet these criteria, even though there may be
differences on the immediate post-tests, is interpreted as a
possible result of rote learning.

These criteria, though valid, are not essential for
suggesting that more complex structures have been acquired.
It is not critical that the trained group's performance be
the same on the immediate and delayed post-tests and
better on the delayed post-test than the control group.
Any decrement shown by the trained group from the
immediate to the delayed post-test, and any failure to
maintain a higher score than the control group might simply be
the result of extraneous variables.

One crucial uncontrolled variable might be memory. It is 
not possible to deduce on the basis of a decrement in per- 
formance on an immediate or on a delayed post-test that 
there is therefore an a priori lack of cognitive structures. 
The crucial question to be considered is whether or not 
material with the same underlying structure can be as- 
similated, that is, incorporated into any such structures if 
they exist, and not whether or not such material can be 
remembered. One is concerned with a qualitative difference 
in learning, not a quantitative difference in memory. On 
the basis of this distinction, one can suggest that if new
structures have been formed, the trained group should be
better equipped to learn and assimilate such material.


A Possible Solution 


A preferable measure of whether or not any more
complex structures have been acquired might be a learning
task involving the same logical structures as those the
training tries to induce. Performance on this learning task
would effectively assess whether or not the training had
produced any structural changes. However, one cannot use
a learning task as a repeated-measures design for assessing
pre- and post-test performance. It is necessary, then, to
design a learning task independent of the post-test but
related to the underlying structure of the training schedule.

One solution to this problem is designing the
generalization task such that it can be utilized as a learning
experience. In many situations this would be a preferable
test of whether or not cognitive structures have been
acquired. First, it recognizes that the time lag on immediate
and delayed post-tests allows other variables to interfere


MÜLLER : 


in the process. Second, it stresses the qualitative aspects
of cognitive growth by assessing not simply whether
material can be better remembered but, rather, whether
new material can be learned or assimilated into the
subject's cognitive network. Furthermore, by delaying the
giving of this generalization task until after the required
time delay, one can assess independently of the delayed
post-test whether the learning is stable over time. This
design may be represented schematically as suggested by
Hofmann.1

A well-designed generalization task can provide data
on all three questions posed by Piaget. It provides
a measure of whether the learning can be generalized, and
whether it is stable over time, and also enables a qualitative
assessment of whether or not new cognitive structures have
been acquired.

Hofmann stated that he made no attempt to cover all
the possible error routes or outcomes associated with
conservation training. He did, however, outline a viable
experimental framework capable of being developed. A
development suggested in this paper is that for many cases
it would be preferable and sometimes essential to make better
and more extensive use of the generalization task in
assessing training procedures. This is the case particularly
when the critical variable is the acquisition of cognitive


THE EFFECTS OF CONTINUOUS PROGRESS INSTRUCTION IN A COLLEGE RELIGION COURSE


structures. Hofmann’s paradigm has been developed to 
give more detailed consideration to the most difficult 
Piagetian criteria to operationalize. 


FOOTNOTE 


1. Utilizing the same formula and notations as Hofmann
(1:49) gives the following research design:


T[1]   X   T[2]                  T[3]     T[3]A
T[1]   X   T[2]   (at least      T[3]A'   T[3]'
N[1]       N[2]    two weeks)    N[3]     N[3]A
N[1]       N[2]                  N[3]A'   N[3]'


Thus, in the training conditions, the pre-test is followed by training
and the post-test. After two weeks the delayed post-test is given
first and then followed by the generalization task, or vice-versa. In
a similar way, two control groups are formed.
ABSTRACT


While a previ that requiring 

toi previous study has shown iini 

in Mproved transfer (as well as improved acquisition arg 
à less hierarchically related discipline, 1. © Religion, $ 


, 
er discipli , 
ences for transfer as a function of instructional 


lave cos int 
> More significant effects on transfer Ga ache 


3 i nm 
i Bion course indicate that acquisition 
Pretation or factual recall. 


he more 
ted in this 


Wut p 
eval MiL INSTRUCTIONAL DEV 
8M i lon has proliferated at the pre x 
E ath, SRA Science, Criterion : 
Ne to deyelop and evaluate instruction 


"Ee level, One at temp that hae h 


| procedure in Religion ! 


chically related discipline, i. e., Physics, leads 
wledge of prerequisites 


hether a greater kno t 
so i pa s orted here which show no dif- 


ntion), it is ethe: 
me roved transfer. The findings rep' e no 
the learning of prerequisites may, 


e findings for requ i 
urse include either analysis and 


» The Continuous Progress concept is derived 
cal considerations: (a) that there 
1 learning rate, and (b) that an 
an undesirable response occurs 
i »sired reinforcer is made contingent 
when the receipt of a desired e ige l 
h irrence of an undesirable response. In Continuous 
e occurTe: " Eae à 
ils made for individual differences 


Progress courses p 


Progress Plan. 
from two basic psychologi 
are individual differences ir 
increase in the probability of 


rovision is 


ge ey 
Vel » T 
“tional developed as part ol à T" «Continuous 
Improvement (4), i$ called the 
in learning rate by allowing students to decide, within

certain limits, when they will be evaluated. In addition, de- 
sirable progress and ultimate success in a course are en- 
sured by making them contingent on reaching criterion 
levels of mastery in succeeding units of material. 

Previous controlled experiments conducted to evaluate
the Continuous Progress concept of instruction have found
greater acquisition and more positive attitudes for
Continuous Progress students than for traditionally taught
students in Psychology, Philosophy, and Biology courses (3)
and in a Physics course (5). One purpose of the present
study was to extend these findings to another discipline,
Religion. It has been shown for Physics that Continuous
Progress instruction leads to greater transfer to a later
Physics course and greater retention one year after original
learning (5). A question raised by this finding is whether
transfer for Continuous Progress would occur in a discipline
less hierarchically related than Physics, i. e., Religion. While
it would be expected from a theory of hierarchically related
knowledge structures (2) that greater original learning in
prerequisite courses facilitates transfer in later related
courses (e. g., Physics), this should not be the case in less
hierarchically related disciplines such as Religion.

A final question posed by this study involved the types 
of educational objectives that can be more efficiently at- 
tained using Continuous Progress procedures. In previous 
studies, objective questions emphasizing either recall 
(Biology, Psychology, Philosophy) or problem solving 
(Physics) were employed. Can students also achieve analysis 
and interpretation objectives (measured through essay items) 
more easily in a Continuous Progress course? The second 
year of the Religion experiment reported here addressed 
itself to this question. 


Method 
Subjects 

A total of 46 students was evaluated: 30 in 1968-69
and 16 in 1969-70. Students were randomly assigned to
Continuous Progress (CP) or Control (C) sections of the
course. Mean verbal SAT scores were not significantly
different for CP and C groups in 1968-69 (t = .14) or
1969-70 (t = .31).
Teaching Method 

Both C and CP sections were taught by one instructor 
in 1968-69, and another instructor taught both C and CP 
sections in 1969-70. Course textbooks and objectives 
were the same for both sections of the course each year, 
although the objectives for 1968-69 emphasized factual 
recall, while the objectives for 1969-70 included both re- 
call and high-order objectives such as comparison and eval- 
uation of different writers’ viewpoints. For example, test 
items for the 1968-69 course were of the following gen- 
eral nature: 
Cicero applied the term “religio” to:
a. national customs
b. a celestial power
c. family rites
d. an inner attitude
e. probable superstitions


In 1969-70, however, test items included some comparative
analysis, as exemplified in the second half of this
question: “What does Smith mean by ‘cumulative tradition’?
How is it related to faith?”

Control Ss both years were instructed to read the
appropriate text materials and attend three lecture-discussion
sections each week. All C subjects took unit exams at the
same predetermined time. CP subjects, while receiving the
same assignments as C subjects, did not attend lectures, but
instead were informed of the objectives of a unit of material
and instructed to come to the professor to take the exam
when they thought they were ready. (The likelihood of
"cheating" was minimized by having several alternate
forms of the test available for individual students.) If a CP
subject did not reach the satisfactory criterion (80% correct)
on his first attempt at a unit test, he reviewed his
errors with the instructor, studied some more, and took an
alternate form of the unit test when ready. This
test-review-retest cycle was repeated, when necessary, a number of
times until Ss attained the 80% criterion.
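
The test-review-retest cycle described above can be sketched as a simple loop over successive attempts. This is only an illustration: the 80% criterion comes from the text, while the function name and the attempt scores are hypothetical.

```python
CRITERION = 0.80  # proportion correct required on a unit test (from the text)


def attempts_to_mastery(scores, criterion=CRITERION):
    """Return the 1-based attempt on which the criterion is first met.

    `scores` holds one proportion-correct value per attempt, in order.
    Returns None if no attempt reaches the criterion.
    """
    for attempt, score in enumerate(scores, start=1):
        if score >= criterion:
            return attempt
    return None


# Hypothetical CP student: two attempts below criterion, then a pass.
example_scores = [0.65, 0.75, 0.85]
print(attempts_to_mastery(example_scores))
```

Under this reading, a CP student's "first attempt" and "last attempt" scores are simply the first element of the list and the element at the returned position.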


Evaluation 


In 1968-69 both C and CP groups took the same ten
unit tests and the same final. In 1969-70 both groups took
six common unit tests and a common final. Comparisons
of C and CP groups' scores on the unit tests and finals
were used for the evaluation of acquisition.

The grades received in the first Religion course taken
after completion of the CP Religion course (or its
corresponding traditional form in the case of the C groups) were
used to evaluate transfer effects both years. A questionnaire
assessing student perception of the procedure was
administered to the 1968-69 group in which the following
questions were posed:

1. Do you feel that a minimum level of achievement as
a requirement for proceeding to the next unit is a desirable
requirement?

2. Do you believe that the procedure for permitting
a student to proceed at his own rate is desirable?


Results and Discussion 
Acquisition 


The median score and interquartile range for unit tests
(summated over tests) and final exams each year are shown
in Table 1. For 1968-69, both first and last attempts to
reach a criterion score on unit tests are included for the
CP group, while for 1969-70, the first attempt was not
available for analysis. Also for 1969-70, the final two (of
six) unit tests were not included in the summated score
for unit tests because two Ss in the C group dropped the
course after the fourth unit test, thus introducing bias.

Table 1.—Central Tendency and Variance of Unit Test and Final
Exam Scores

Score                    Group    Median    Interquartile Range

1968-69
First Attempt            CP        500         26.00
(summed over units)      C         324         72.00
Last Attempt             CP        511         20.00
(summed over units)
Final Exam               CP         70          6.25
                         C          52          9.50
1969-70
Last Attempt             CP        362         10.75
(summed over units)      C         317         67.63
Final Exam               CP         75          8.13
                         C          70         13.38

For 1968-69, Mann-Whitney U tests revealed significant
differences between the CP group's first attempt to reach
criterion on the unit tests and the C group's unit tests
(z = [illegible]; df = 13, 23; p < .001), the CP group's last
attempt and the C group's unit tests (z = 4.92; df = 13, 23;
p < .001), and the CP and C groups' final exam scores
(z = 3.63; df = 7, 23; p < .01). In each of these comparisons,
the CP group showed the higher performance.

This finding of higher performance even on the first test
is like the finding of higher performance on the first test
for Biology, but unlike the finding on the first test for
Psychology (3) of no significant difference between C and
CP groups. In two of three courses, the intuitive notion
that CP students would not prepare for the first test,
but rather take the first test to find out the
nature of the test, is clearly unfounded.

For 1969-70, Mann-Whitney U tests revealed significant
differences between CP unit tests (last attempt) and C
unit tests (U = 2; df = 7, 9; p < .002), but not between CP
and C groups' final exam scores. For unit tests, the
CP group showed higher performance than the C group.

Two explanations for the finding of no difference on
the final exam seem plausible. First, two C subjects who
obtained low grades on the first four tests dropped the
course, while no CP students dropped. This resulted in an
inflation of C group scores relative to what they would
have been had the two poorer students remained in the
course. Secondly, it is possible that CP subjects did not
study as hard for the final because of the assumption (correct
for most of them) that since they had an above-B
average going into the final, they could afford to do less
than B level on the final and still get a B in the course. If
this is the case, a different set of contingencies (e. g.,
weighting the final more heavily) could be constructed to
maintain CP performance at a higher level even on the
final.

It is interesting to observe that the semi-interquartile
range of the CP group (8.13) is considerably lower than
that of the C group (13.38). This suggests that the
treatment is effective in reducing variability.

Since the 1969-70 course used essay exams, the results
seem to warrant the conclusion that Continuous Progress
procedures facilitate acquisition of higher-order objectives
such as analysis and comparison (1969-70) as well as recall
(1968-69) objectives. This conclusion is limited, however,
by the fact that inspection of the essay questions used in
1969-70 revealed more recall questions than comparison-
or analysis-type questions. Further research on the question
of types of objectives which may be attained efficiently
using CP procedures needs to be conducted where test items
are carefully written to measure various types of learning.

Transfer 
A Mann-Whitney U test completed on grades received
in the first Religion course taken following completion of
the CP and C sections of the experimental course revealed
no significant difference between C and CP groups (U = 90,
ns). Thus, it appears that while CP procedures facilitate
acquisition in a Religion course, they do not improve
transfer.
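
For readers unfamiliar with the statistic, the following is a minimal sketch of how a Mann-Whitney U and its normal approximation are computed from two independent samples. It is not the authors' computation: the grade values are hypothetical, the function names are illustrative, and the z formula shown omits the correction for ties.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using midranks for ties."""
    combined = sorted(sample_a + sample_b)
    # Give every distinct value the mean of the rank positions it occupies.
    rank_of = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        rank_of[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(rank_of[x] for x in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


def normal_approx_z(u, n_a, n_b):
    """Normal approximation to U (no tie correction applied)."""
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return (u - mean_u) / sd_u


# Hypothetical grade points in a later course for CP and C students.
cp_grades = [3.0, 3.5, 4.0, 2.5]
c_grades = [2.0, 3.0, 1.5]

u_cp, u_c = mann_whitney_u(cp_grades, c_grades)
z = normal_approx_z(u_cp, len(cp_grades), len(c_grades))
print(u_cp, u_c, round(z, 2))
```

The smaller of the two U values is the one conventionally compared with tabled critical values; with larger samples the z form is used instead.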


Since a previous study (5) found greater transfer in
Physics for a CP group than for a C group, it is plausible
that CP procedures facilitate transfer only in courses in
which a strong hierarchical relation exists between the CP
course and the more advanced course. That is, greater
acquisition shown by CP subjects will facilitate acquisition
at a more advanced point in the hierarchy.


Students' Perceptions

Mann-Whitney U tests (normal approximation) of the
two questions regarding perceptions of CP procedures
revealed no significant differences between groups. Both
groups were neutral (z = 1.27, ns) toward the requirement
of mastery, and positive (z = 1.57, ns) toward the procedure
of allowing students to progress at their own rate. While the


Summary and Conclusions

In summary, the findings of this study add to the
generality of the concept of Continuous Progress instruction
by showing that (a) acquisition is improved in Religion,
and (b) acquisition is improved for higher-order objectives.
The finding of no transfer effects for Religion tends to
support the argument that CP procedures facilitate transfer
only in courses where strong hierarchical relations exist.


REFERENCES
ABSTRACT


SUPPLYING EXPLICIT STATEMENTS of instructional
objectives to learners is an integral part of mastery learning
models of instruction (1, 2, 5). This practice appears to
be based on the assumption that objectives will reduce the
student's uncertainty about what is required of him, thus
permitting the student to maximize learning by selectively
attending to the most relevant stimuli in the instructional
setting. If this assumption is valid, then it is reasonable to
expect that when performance is compared between Ss
given behavioral objectives (BOs) and Ss not given
objectives, those possessing objectives should exhibit greater
learning. As Duchastel and Merrill (6) demonstrated in
their extensive review of objectives research, however, this
relationship has not been consistently observed. While
learner possession of objectives has been shown to facilitate
learning in a number of studies, such a facilitating effect
has not been observed across all studies. The generalizability
of such an effect is therefore quite difficult to evaluate
at this time. Furthermore, serious methodological problems
appear in the literature with such frequency that it is
possible to place confidence in few of the studies.

Although there are many methodological inadequacies
in the objectives literature, the ability of learners to use
the objectives given to them emerges as an especially
critical question for research. Several investigators have
suggested that students need to know how to use objectives
before effects on learning will be present (3, 4, 9, 12, 13).
Only three reports have been found, however, which
describe training learners to use objectives.
"d S, although ee (4) attempted to train Ss to use 
uning, Furth ver assessed the effectiveness of the 
mene that tener concluded from anecdotal 
Sa no 
Morse and Tillman (12) empirically tested the effects
of their training efforts. Morse and Tillman's training
consisted of having Ss read Mager's (11) Preparing
Instructional Objectives with accompanying classroom
instruction. Ss in a second training condition were directed
to read Mager's book out of class, with no classroom
instruction provided. A third group (control) was directed
to perform an unrelated task.

In the second part of the study, one half of the Ss were
given BOs for a reading and the remaining half were not.
Ss with objectives achieved higher scores on test items
matched to those objectives than Ss not possessing
objectives. However, no significant main effect due to
training and no significant interaction between training
and possession of BOs were found. Consequently, Morse
and Tillman concluded that training was not necessary for
students to use objectives effectively in learning.

The factor which most seriously limits the confidence
which may be placed in this conclusion concerns the
validity of the training procedures. Morse and Tillman
acknowledge that Mager's book provides information
about objectives, but does not contain instruction in how
to use objectives in learning. Hence, the validity of the
training is questionable. Conclusions about the effects of
lishing'@ strong 
d the required 


ehayi or. 
ate abl, ole is assumed to be a learning tool, it seems 
“Vey, * to Use students may require training boron” - 
ios ‘ie with maximum 77. . : 
ac ility to ligators have ignored the questio 
Will É tS ate s use objectives on the assumpto" 
Moy 4 edhe BOs they will use them, an 
m lon iin the investigator intended. E 
n the need for training students to use 0 
is presently available, the validity of the two
assumptions is not known. However, if training is necessary
ict oie nn de of Oma 
te k s important to inei 
een yea to use BOs is necessary, ee 
The relationship between training in the use of objectives
and learner achievement was investigated in this study.
Specifically, this hypothesis was tested: When objectives are
provided for a unit of instruction, Ss trained to use
objectives will achieve a significantly greater number of
correct answers on an examination consisting of items
matched to the objectives than Ss not so trained.


Method 

Subjects 
Ss (N = 159) were undergraduate students enrolled in a
survey course of human communication theory at a major
southern university. Ss varied extensively in the fields
selected for their major(s) and minor(s). They were not
informed that a study was being conducted.


Training

During a previous term students enrolled in the course
who achieved at least the grade of 'C' completed a
questionnaire which asked them to identify the steps they went
through in using objectives to study for course examinations.
From these self-reports, five steps in using objectives
were identified:
1. Read the objective to identify where important
material may be found.

2. Read the material to locate specific passages related
to the objective.

3. Read the objective to determine the form of the test
item.

4. Read the objective to determine what you must be
able to do to answer the test item correctly.

5. Ask yourself a question in a form similar to the one
you will be asked on the test and try to answer it.

From these five steps, the following six objectives for
training students were derived:

1. Given a behavioral objective in which one part(s) of
the objective (e. g., the part identifying where important
material can be found) is underlined, and five alternative
statements, the student will select the statement which
most accurately describes why the underlined part(s) of the
objective is valuable when using BOs to learn.

2. Given a behavioral objective, a reading passage
divided into five separate, numbered parts, and five
alternatives from which to choose, the student will select the
alternative which correctly identifies the part(s) of the
reading passage which contains information most relevant
to the behavioral objective.

3. Given a behavioral objective, a reading passage which
contains information relevant to the objective, and a
sample multiple-choice test item matched to the BO, the
student will select from the five alternatives in the sample test
item the correct answer to the item.

4. Given a behavioral objective which has been divided 
into numbered parts, and five alternatives from which to 
choose, the student will select the alternative which cor- 
rectly identifies the part(s) of the BO specifying: (a) the 

form of the test item; (b) what the student must do to 
answer the test item correctly; and/or (c) where important 
material can be found. 

5. Given the five steps in analyzing/using behavioral 
objectives, and five alternatives from which to choose, the 
student will select the alternative which correctly: (a) iden- 
tifies the five steps in proper sequence from first to last, 
or (b) identifies the proper place of any step(s) in the se- 
quence. 

6. Given a behavioral objective and three sample test 
items, the student will select the best item(s) which is 
most closely matched (i. e., most appropriate) to the test 
item form specified by the behavioral objective. 


A 60-frame, branching type, instructional program was 
developed to teach Ss how to perform each behavior. Ex- 
amples of BOs and test items appearing in the program 
(as well as the training tests) were drawn from the various 
units which composed the course (i. e., intrapersonal; 
interpersonal; small groups; nonverbal; and mass media com- 
munication). The program underwent three separate re- 
visions on the basis of responses obtained in a pilot study. 
The validity of the training is supported because the be- 
haviors which the program was designed to teach were 
derived from strategies successful students reported em- 
ploying in using BOs to learn. 


Training Tests 

Four test forms were developed. Each test form con- 
tained at least two test items for each of the six objectives 
for the programmed instruction. The first form contained 
20 items and the remaining three forms each consisted of 
12 items. The minimum level for acceptable performance 
was set at 90% correct answers for each test form. 

The validity of the training tests was assessed by having 
six trained judges rate on a three-point scale the extent to 
which each of 21 items randomly selected from the four 
test forms corresponded to the objectives to which they 


were matched. Perfect correspondence between the 21 items 


and the matched objectives would be represented by a
summated score of 63. The mean of the summated scores
for the six judges was 61.33. This value indicates high direct
validity of the items. The inter-rater reliability of these 
ratings obtained by Ebel’s (7) analysis of variance procedure 


was .98. 
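
The arithmetic behind the summated score can be sketched as follows. This is a minimal illustration, assuming each of the 21 items could receive at most 3 points from a judge; the function names are hypothetical and not part of the original study.

```python
def max_summated_score(n_items: int, scale_max: int) -> int:
    """Highest possible summated rating for one judge."""
    return n_items * scale_max


def agreement_ratio(mean_summated: float, n_items: int, scale_max: int) -> float:
    """Observed mean summated score as a share of the maximum."""
    return mean_summated / max_summated_score(n_items, scale_max)


# 21 items rated on a three-point scale give a ceiling of 63.
print(max_summated_score(21, 3))
# The reported mean of 61.33 expressed as a proportion of that ceiling.
print(round(agreement_ratio(61.33, 21, 3), 3))
```

Under these assumptions the judges' mean rating reaches roughly 97% of the maximum possible score.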
Reliability coefficients obtained by use of the Kuder- 


Richardson 20 for the four forms were .71, .63, .76 and .03.²


Coefficients obtained by Livingston's (10) criterion-
referenced procedure were .92, .69, .77 and [illegible].
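
For readers unfamiliar with the Kuder-Richardson 20 coefficient, the following is a minimal sketch of the standard formula for dichotomously scored items. The response matrix is invented for illustration, and the use of the population variance of total scores is an assumption (some treatments use the sample variance).

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of examinee rows, each a list of 0/1 item scores."""
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # assumption: population variance
    if var_total == 0:
        raise ValueError("KR-20 is undefined when total scores do not vary")
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n  # proportion correct on item j
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical data: four examinees, three items.
demo = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(demo))
```

Because the total-score variance sits in the denominator, a form taken by only a few Ss with nearly identical scores can yield a very small coefficient.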
Unit Examination 


The dependent variable in the investigation was the num- 
ber of correct answers obtained by Ss on a 28-item multiple- 
choice (five alternative) test. The test consisted of two items 
for each of 14 objectives constructed for Kenneth Gergen's 
(8) Concept of Self. Objectives were written in the format 
of this sample: 


Given five alternative statements, the student will 

select the statement which most accurately illustrates 

or describes the concept of double bind (Chapter II). 
The following test item was written to match the objective: 


Select the alternative which best illustrates the con- 
cept of double bind: 


A. Martha and Milton decide they want to eat dinner 
out Friday night. She wants Greek food and he wants 
Hungarian food. They cannot agree on where to go. 
B. Armando’s doctor tells him he has an ingrown nail 
that must be corrected now. Armando decides to wait 
until he can afford the expense. 
C. After having her color television repaired, Debbie 
pays the serviceman but is not satisfied with the way 
the machine works. 

D. Gina needs nine hours to graduate. She cannot 
decide whether to take one 5- and one 4- hour 
course, or three 3-hour courses. 

E. Harold's wife Louise tells him often that she 
loves him, but frequently ruins his favorite meals 

by overcooking them. 


Six trained judges examined the test items and agreed 
unanimously that each item satisfied the specifications of 
the objective to which it was matched. 

The reliability of the scores obtained on the test was 
determined to be .86 using the norm-referenced Kuder- 
Richardson Procedure, and .92 using Livingston’s (10) 
criterion-referenced procedure. 
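
The relation between the two coefficients can be sketched with Livingston's criterion-referenced formula, which adjusts a norm-referenced reliability by the squared distance between the mean score and the criterion score. The variance, mean, and criterion values below are hypothetical, chosen only so that a norm-referenced value of .86 yields .92; the study's actual summary statistics are not given in this passage.

```python
def livingston_k2(reliability: float, variance: float,
                  mean: float, cutoff: float) -> float:
    """Livingston's criterion-referenced reliability coefficient."""
    d2 = (mean - cutoff) ** 2  # squared distance of the mean from the criterion
    return (reliability * variance + d2) / (variance + d2)


# Hypothetical values: variance 12, mean 23, criterion score 26.
print(round(livingston_k2(0.86, 12.0, 23.0, 26.0), 2))
# When the mean equals the criterion, the coefficient reduces to the
# norm-referenced reliability.
print(livingston_k2(0.86, 12.0, 26.0, 26.0))
```

The coefficient therefore rises as the group mean moves away from the criterion, which is one reason it can exceed the Kuder-Richardson value for the same scores.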


Procedure 


To clarify the description of the administration of the 
experimental treatments, the events which took place dur- 


ing each of the first four class sessions are discussed in the 
temporal sequence in which they occurred. 


First Class Session 


Ss were randomly assigned to either the training or no-
training treatments. Within each treatment condition, each 
S was randomly assigned to one of three instructional sec-
tions. Each section was supervised by two instructional 
assistants (IAs). Three graduate students and nine under-
graduates served as IAs. Each of the undergraduates had 
completed the course in the previous term with the grade 
of ‘A’. IAs were randomly assigned to the six instructional 
sections. 
Second Class Session 


Ss reported to their assigned sections. Ss in the 
three sections assigned to receive training were informed of 
their assignment by the IAs, who also discussed general 
classroom procedures. The IAs then distributed Form I 
of the training test. Ss were told that if they answered 90% 
of the items correctly, they would not have to work 
through the programmed instruction nor take any ad-
ditional forms of the training test. Ss used machine scorable 
IBM answer sheets to record their answers. Upon complet-
ing the test, Ss returned their test copies and answer sheets 
to the IAs. Ss were told that attendance was required at 
the next class meeting, at which time they would be in-
formed of their performance on the test. 

Following the session, the answer sheets were scored. Of the 
Ss completing the test, only three achieved the 90% 
criterion. 

IAs in the three sections which did not receive training 
told Ss about general classroom procedures and announced 
that attendance for the next class session was required. 


Third Class Session 
Ss in the training sections were informed of their per-
formance on Form I of the training test. Ss who achieved 
the 90% criterion score were excused from class. Ss not 
reaching criterion were given the programmed instruction, 
directed to work through it and to then request Form II of 
the training test. 
When a S completed Form II of the training test, he re-
turned the test copy and his answer sheet to his IAs, who 
immediately scored the answer sheet with a punched an-
swer key and completed a feedback sheet which was given 
to the S. The feedback sheet informed the S of the per-
centage of correct answers which he had obtained. If his 
performance was less than the 90% criterion level, the 
feedback sheet identified the BOs for the program which 
corresponded to the test items answered incorrectly. The 
feedback sheet also identified frames in the program 
which contained information relevant to the unmastered 
objectives. The IAs then returned the S's copy of the 
program and encouraged him to restudy it. This set of 
procedures was repeated for Ss who failed to achieve the 
criterion level for the third form of the training test. 
Only six Ss failed to achieve criterion on the third form, 
but each of these was able to achieve the 90% level on the 
fourth form. 

Ss in the no-training condition reported to their assigned 
classrooms and received the following placebo treatment: 
Ss were informed by their IAs that a graduate student in 
[illegible] needed their assistance in conducting re-
search. Each S was given a booklet which contained direc-
tions for completing semantic differential scales related to 
the nonverbal behavior of teachers in classroom settings. 
The task required approximately 45 minutes to complete. Ss 
were informed that by completing the task they satisfied 
a requirement that they participate in an experiment. 

Before leaving class, Ss in all sections were given copies 
of the BOs and required readings for the first unit. Ss not 
given training were told they could take the test for the 
first unit at the next class session, if they wished to do so. 
Ss receiving training were informed that they could take 
the examination for the first unit only after they had 
achieved criterion on the training test. 


Fourth Class Session 


IAs for all sections were present in their assigned class-
rooms to answer questions regarding the readings and ob-
jectives for the unit. Ss were permitted to attempt the ex-
amination. Testing was self-paced, i. e., a S took the test 
when he felt sufficiently prepared. To attempt the exam-
ination, a S requested a test copy and answer sheet from 
his IAs. The IAs scored the answer sheet as soon as the test 
was completed. When scores were available for all Ss on the 
examination, the data were analyzed. 


Results 

A directional t-test for independent data was com-
puted for the number of correct answers achieved by the 
two groups on the course examination. The t-test analysis 
produced a significant t-value (t = 2.37; df = 157; p < .01), 
indicating that the trained Ss had higher scores, thus sup-
porting the hypothesis. Table 1 summarizes the results of the 
analysis. 
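
As a rough illustration of the statistic reported above, the following sketch computes a pooled-variance t value and its degrees of freedom for two independent groups. It assumes the conventional equal-variance form of the test, which the article does not spell out, and the sample scores are invented. Under that assumption, df = 157 would correspond to 159 Ss in total, since df = n1 + n2 - 2.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group_a)
                  + (n2 - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical scores, not the study's data.
trained = [3, 4, 5]
untrained = [1, 2, 3]
t_value, dof = pooled_t(trained, untrained)
print(t_value, dof)
```

For a directional hypothesis, the resulting t would be compared against the one-tailed critical value at the chosen significance level.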


Table 1.-Summary of Analysis 

[Table: summary data (standard deviation, cell size) by treatment, 
No-Training and Training; the numerical entries are not legible.] 


Discussion 


Although differences in achievement between trained and 
untrained learners were statistically significant, the absolute 
difference between the mean scores of the two groups was 
less than two points. Such a small effect attributable to 
training might possibly be accounted for in at least two 
ways. First, IAs answered all questions which Ss asked about 
the unit objectives and their relation to the unit test. Hence, 
although untrained Ss did not receive formal instruction in 
the use of BOs, they were not denied information provided 
informally about their use. Second, although no data are 
available, it seems probable that Ss receiving training dis-
cussed the instruction with some Ss not trained. Some un-
trained Ss may also have discussed the use of BOs with 
friends who were enrolled for the course in previous terms. 

Since a significant positive effect for training was found 
despite conditions which might have mitigated the effect, 
it seems important that future investigations of learner 
possession of objectives should account for learner com-
petence in their use. If it is indicated that learners lack the 
basic ability to use objectives, then training should be 
provided. 

Additional research is needed to increase knowledge re-
garding: (1) what types of learning are facilitated by in-
struction in the use of objectives; (2) what the nature of 
such instruction should be, i. e., the most effective way to 
use objectives in various types of learning; and (3) the most 
effective way of providing such instruction, e. g., program-
med instruction, lecture, small group discussion, etc. Re-
search concerned with these and related problems should 
provide findings that will prove useful to both teachers and 
instructional researchers. 
FOOTNOTES 

1. This article is based upon the Ph.D. dissertation of the first 
author entitled, "Effect of Training in the Use of Behavioral Ob-
jectives and Knowledge of Results on Student Performance in a 
Mastery Learning Course in Speech Communication" (University 
Microfilms No. 74-6715). The dissertation was completed under 
the direction of the second author. 

2. This exceptionally low reliability coefficient was attributed 
to the small amount of variance present in the scores of the six Ss 
completing the test. 


REFERENCES 

Block, J. H., “Operating Procedures for Mastery Learning," in ... 
Duchastel, P. C.; and Merrill, P. F., ... 
Gergen, K., Concept of Self, Holt, Rinehart and Winston, ... 
Livingston, S. A., “Criterion-Referenced Applications ... 
Mager, R. F., Preparing Instructional Objectives, Fearon, ... 
Morse, J. A.; and Tillman, M. H., ... 


THE TYPOLOGY MODEL 


JOAN L. GREEN 
University of San Francisco 


ABSTRACT 


This report describes the investigators' experiences in the use of cluster and object-analysis techniques (BC-Tryon System) to 
test the variations in membership and curricular preferences of subgroups of students within a given class over a three-year period. 
Students’ scores on Q-sort items describing preferences for [...] experiences, teaching methods and styles, and 
student-faculty relationships were determined each year and submitted to pre-set cluster analysis. A typology of subgroups [...] 
It was found that [...] given the same population of students each year [...] 


GREEN AND STONE 


THE IDEA OF curriculum planning and implementa-
tion based on the identified curricular preferences of 
students has been explored by the authors in two research 
efforts and several publications (1, 2, 3, 4). A model for 
[...] planning in higher education, based on the 
evaluative perceptions of students, was defined as the stu-
dent typology approach to curriculum implementation. 
The purposes of this report are to (1) discuss further de- 
velopments resulting from statistical experimentation with 
the model, and (2) present their implications for curriculum 
development. 


The Problem 


The typology model is an alternative to curriculum im-
plementation which involves simultaneous approaches to 
the teaching-learning process for a sequence of articulated 
courses within a subject matter field, such as might be 
found in the professional programs in nursing, engineering, 
teaching, or any liberal arts major. The method suggests 
ways in which students’ curricular preferences for various 
teaching-learning styles can be identified early and used as 
the basis for instructional planning. In the typology model 
the individual perceptions of students for the most pref-
erable and least preferable features of the course sequence 
are obtained through the administration of a Q-sort (5). 
The items in the Q-sort are written to describe the more 
general aspects of the curriculum under the control of 
the faculty, and reflect its philosophy and objectives. 
The items within the Q-sort are arranged in categories, 
each of which represents a broader conceptualization or 
generalization of the program areas which the faculty can 
adjust or modify according to the curricular preferences 
of individuals and yet still achieve course objectives. The 
categories, for example, might be (1) program aims and 
objectives; (2) learning experiences; (3) teaching methods; 
and (4) nature of student-faculty relationships. The 
students' scores on the individual Q-sort items are 
translated into a composite score for each of the cat-
egories, resulting in a set of individual mean category or 
cluster scores for each student (6). 
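
The translation of item scores into category scores can be sketched as follows. This is a minimal illustration that assumes the composite is the simple mean of the item scores within each category; the item labels, category names, and scores are hypothetical.

```python
from statistics import mean


def category_scores(item_scores, item_category):
    """Average one student's Q-sort item scores within each category."""
    grouped = {}
    for item, score in item_scores.items():
        grouped.setdefault(item_category[item], []).append(score)
    return {category: mean(values) for category, values in grouped.items()}


# Hypothetical Q-sort placements for one student.
scores = {"q1": 7, "q2": 5, "q3": 2, "q4": 4}
categories = {"q1": "objectives", "q2": "objectives",
              "q3": "teaching", "q4": "teaching"}
print(category_scores(scores, categories))
```

Each student thus ends up with one score per category, and the set of those scores forms the profile used in later grouping.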

Using the students’ individual cluster scores in object 
analysis, a typology of subgroups of students can 
be developed. Each student member of a subgroup mani-
fests a profile of cluster scores similar to the other 
members of the same subgroup, but different from the 
members of other subgroups. In other words, members 
of a given subgroup express the same preferences for 
features of the course sequence which they 
value, as well as those which are of least interest, concern, or 
value to them. The more alike the cluster score profiles of 
subgroup members, the greater will be the homogene-
ity within subgroups and the greater the differences 
between groups. Definition of the nature, 
composition, and preferences of each of the subgroups 
permits personalized curriculum planning designed 
[...] the preferences of students with the same 
[...]. Such program planning optimizes attainment 
of the general objectives of the course sequence for each 
subgroup. 
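
The idea that subgroup members share similar score profiles can be illustrated with a simple distance comparison. This sketch is not the BC-Tryon object-analysis procedure; it assumes Euclidean distance as the similarity measure, and the subgroup profiles and student scores are hypothetical.

```python
from math import dist


def nearest_subgroup(profile, subgroup_means):
    """Return the name of the subgroup whose mean profile is closest."""
    return min(subgroup_means,
               key=lambda name: dist(profile, subgroup_means[name]))


# Hypothetical mean cluster-score profiles for two subgroups.
means = {"I": [40, 60, 50], "II": [55, 45, 50]}
student = [42, 58, 49]
print(nearest_subgroup(student, means))
```

The smaller the distances within a subgroup relative to the distances between subgroups, the more homogeneous and distinct the typology is.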

The typology model is based on the following assump-
tions: 


. 1. It is legitimate to plan the implementation of cur- 
riculums upon the preferences of students currently 
enrolled in a program. 
2. A given group of students is composed of individ-
uals with varying preferences and priorities for the 
curriculum and can be so differentiated. 

3. Curriculum planning and implementation based on 
the identified preferences of students maximizes the 
achievement of the aims of the program by all students 
and increases the likelihood of higher levels of student 
satisfaction. 

4. Curriculum planning and implementation based on 
the identified preferences of students provides for stim- 
ulation, creativity, and flexibility, thereby resulting in 
greater satisfaction on the part of the faculty. 

On the basis of the authors’ previous investigations, it 
was speculated that priorities of students identified early 
in a course sequence might be used to group students on 
the basis of their curriculum preferences throughout the 
remainder of the course sequence [see Figure 1, as originally 
published in (2)]. 


[Flow chart: Purpose of Program or Course Sequence / Decisions to 
Be Made; Prescriptive Q-Sort early in the course sequence; Pre-Set 
Cluster Analysis; Object Analysis; Typology of Students (Subgroups 
I, II, III, ... N); Specific Curricular Plan for Each Typology 
Subgroup for the remainder of the program or course sequence.] 

Figure 1.-Typology Model for Curriculum Implementation 


Questions to Be Investigated 


The specific questions to be answered are: 

1. Will a given group of students maintain the same 
typology structure throughout the period of time they are 
enrolled in a program or course sequence? 

2. Will [...] persist throughout the program or course 
sequence even though students' curricular preferences [...]? 

3. Will a subgroup [...] change similarly? 

The speculation that priorities of students identified 
early in a curricular experience would persist throughout 
the remainder of a program (thus facilitating group- 
oriented curricular planning) is based on observations 

made during a five-year curriculum evaluation project 
conducted at the University of San Francisco School of 
Nursing (2). 

In that study it was found that there was a tendency 
for curricular preferences to persist from year to year. 
This finding was based on cluster analysis of the 
students' scoring of Q-sort items. Some of the same 
clusters, describing a given class’s curricular preferences 
(i. e., Class of 1970), tended to reappear from year to 
year for a given class at each of the sophomore, junior, 
and senior levels. Other clusters emerged from the data 
analysis for each level of the curriculum (i. e., sophomore, 
junior, or senior), irrespective of the group of students, 
and were believed to be unique to a given level of the 
program since they represented the learning experiences 
characteristic of that level of the program. 

Thus, some clusters describing curricular preferences 
seemed to be associated with the "personality" of a given 
class, i. e., the Class of 1970, while other clusters seemed 
to be associated with the nature and character of a given 
level of the program. Students would assess the junior 
year of the curriculum in two ways: (1) simply because 
they were junior students evaluating the junior level of 
the program, and (2) simply because they were either the 
Class of 1970, 1971, or 1972. It was these findings that 
led to the development of the typology model and the 
refinement of the cluster technique so that data collected 
might be used for both curriculum implementation and 
curriculum evaluation. 


Rational Clusters 


The commitment to the development of a model for 
both curriculum implementation and evaluation led to the 
use of rational clusters. The rational approach provided 
for the inclusion of all Q-sort items in the clusters. This 
approach provides a basis from which to evaluate the total 
program or course sequence since the faculty can use a com- 
mon base for yearly review of program activities from the 
same frame of reference. This approach also appears to be 
a solid base from which to plan for personalized group 
instruction since the faculty can pre-plan alternatives to 
instruction and thus need only “plug in” the “right 
student subgroup” for each option once the students’ 
preferences have been identified. 

The authors were curious, therefore, whether a typology 
of students based on rational clusters also would persist 
from year to year, or if it would be necessary to define the 
typology each year that a given group of students was en- 
rolled. Since complete Q-sort data (from each of the 
sophomore, junior, and senior years) were available on two 
separate classes of University of San Francisco (USF) 


students (the Class of 1971 and the Class of 1972), those 
data were analyzed further in order to test the hypothesis 
that curricular preferences of students did indeed prevail 
from year to year when using the rational clusters as the 
basis for organization. It was expected that either ac- 
ceptance of or failure to reject the hypothesis would lead 
to further refinement of the typology model and increase 
its usefulness and generalizability for other educational 
programs. It was recognized that an important limitation 
to the results of this analysis would be the fact that the 
findings would be based on the retrospective perceptions 
of two classes of students evaluating common curricular 
experiences. The typology model has not yet been sub- 
jected to testing through simultaneous approaches to 
curriculum implementation based on the identified needs 
of subgroups of students. Such experimentation may well 
alter again the patterns, membership, and maintenance of 
typology constructs. 


Related Research 


Nine clusters were used in the initial research serving as 
the background for this report. All 72 items of the Q-sort 
designed to obtain student perceptions of the curriculum 
were assigned to the most appropriate cluster. The clusters 
were seen as the areas of the USF program over which the 
faculty had control and were areas in which students 
would tend to differ in their preferences. They were the 
generalizable areas of the curriculum and were pertinent 
to all three levels (sophomore, junior, and senior) of the 
professional component of the program, independent of 
specific subject matter content. 

Based on the content of the items contained in each of 
the clusters, two descriptive statements were written to 
accompany each cluster. The first statement described the 
kinds of learnings, methods of teaching styles and pro-
cedures, evaluation techniques, and/or nature of faculty-
student relationships which would be favored as high-
priority or highly preferred characteristics of the students 
who would score high on that cluster. The second state-
ment defined the areas of disinterest or low priority of 
students who would score low on that cluster. Titles of 
each of the clusters and the characteristic high-scoring 
and low-scoring statements describing them follow (1: 
51-57): 


Cluster I: 


Program Objectives Conducive to Professional 
Attitudes and Understandings (12 items) 


High scorers on this cluster are in accord with the value 
of the professionally oriented objectives of the program and 
express the highest degree of satisfaction with learning 
experiences designed especially to effect those goals. 
Subjects agree that significant curricular experiences 
should be characterized by learnings which would: 
(1) help students understand concepts of economics 
[...] relate to comprehensive and continuous health care; 
[...] administer health care [...] patients and their [...]; 
[...] their professional obligation to utilize research findings, 
[...] and continue their own education; and 
[...] develop skills of effective communication and inter-
action with patients, families, and all levels of health 
workers. 

Low scorers on this cluster are not committed to the 
professionally oriented objectives of the program and 
attach little value to learning experiences stressing the 
[...] of the professional practitioner. Sub-
jects tend to be dissatisfied with the kinds of learning 
situations valued by high scorers. 


Cluster II: 
Program Objectives Conducive to the Development 
of Problem Solving Skills (10 items) 


High scorers on this cluster are in accord with the value 
of program objectives which guide the learning opportun-
ities emphasizing development of professional expertise 
in solving health care problems. Subjects agree that 
significant curricular experiences would be defined by 
opportunities to gain confidence in their ability to 
function effectively in all settings with patients of all ages 
and their families. Subjects agree further that the skills 
necessary for them to assist others to achieve and main-
tain health require specific laboratory exper-
iences in: (1) establishing therapeutic relationships; 
(2) [...] coping measures used in crises; (3) making 
independent judgments; (4) initiating change; (5) being 
resource personnel; (6) making referrals; and (7) teaching 
essentials of health care. 

Low scorers on this cluster are not committed to the value 
of learning opportunities in the program which 
emphasize professional problem-solving skills. Subjects 
tend to place less importance on laboratory experiences re-
quiring them to develop such skill or demonstrate com-
petency in the behaviors valued by the high scorers. 


Cluster III: 
Arrangement of Learning Opportunities Featured in the 
Program (8 items) 


High scorers on this cluster are in accord with the value 
of sequential laboratory experiences designed to provide 
[...] and to move on to more [...] settings as they demonstrate 
[...]. Subjects agree that settings [...] persons [...] a 
variety of diverse social classes and cultural back-
grounds would provide the most ideal [...] to observe, plan, 
initiate and administer [...] and to gain skill in performing 
a variety of procedures [...]. 

Low scorers on this cluster are less concerned with the 
arrangement of their learning experiences in an orderly 
progression and do not agree that diversity in the educa-
tional setting is essential to the mastery of learning ob-
jectives. 


Cluster IV: 
Role of Group Learning and Instruction in the Program 


(6 items) 


High scorers on this cluster are in accord with the value 
of didactical methods characterized by the group process. 
Subjects agree that their learning experiences should feature 
group projects and conferences, team teaching, and section 
and seminar meetings so as to benefit from the learnings of 
their peers and the diverse experiences and interests of the 
faculty. They concur that group conferences before and 
after laboratory experiences provide valuable opportunities 
to communicate learning needs and objectives, clarify the-
oretical concepts fundamental to nursing practice, and to 
otherwise prepare for the laboratory itself. 

Low scorers on this cluster doubt that group learning 
and teaching can expedite their own achievement and 
question the contribution of instruction employing these 


methods. 


Cluster V: 
Characteristics of Exemplary Faculty Members 
(8 items) 


High scorers on this cluster are in accord with the value 
of nurse faculty members who can serve as professional 
role models. Subjects agree that the ideal instructor keeps 
up with changes in the practice of nursing and is herself 
a competent practitioner. As educators, subjects value 
faculty members who respect students as adults and forth-
coming professionals, are prudent in their discussion of 
matters relating to students, initiate opportunities for 
exchange of ideas between students and the faculty, and 
air differences of opinion between and among the students 
and faculty openly and rationally. Subjects agree further 
that such collegiality is demonstrated by willingness of 
faculty members to participate in student-initiated social 
activities. 

Low scorers on this cluster doubt that the exemplary 
roles of faculty members assume a vital function in their 
professional education. Subjects are skeptical that col-
legial interactions between the students and faculty have 
a positive influence on the learning process. 


Cluster VI: 
Faculty Behaviors Which Organize Instruction 
(8 items) 

High scorers on this cluster are in accord with the value 
of learning experiences which are planned, structured, 
and directed by the faculty. Subjects agree that new learn-
ing experiences, whether clinical or theoretical, should 
reflect a high degree of faculty intervention to identify 

the general and specific needs and interests of students and 
to plan accordingly. 

Low scorers on this cluster minimize the necessity for 
the faculty to organize and intervene in their learning 
activities or to determine subjects’ levels of readiness for 
new experiences. These subjects do not rely on faculty 
members to identify or assist in the transfer of their 
learnings nor are they concerned that the faculty heed stu- 
dent opinions about curricular experiences. 


Cluster VII: 
Faculty Behaviors Which Individualize Instruction 
(7 items) 


High scorers on this cluster are in accord with the value 
of formulating their own learning objectives and selecting 
the most appropriate experiences by which to achieve 
them. Subjects agree that in recognizing and honoring 
student-selected objectives and experiences, the faculty 
should assist students to choose realistic plans capable 
of achieving success while simultaneously encouraging 
alternative approaches and supporting student decisions 
which might be contrary to those faculty members would 
make in similar situations. 

Low scorers on this cluster do not recommend student- 
selected or highly individualized learning experiences. 
They reject the notion of alternate approaches to solving 
nursing problems and agree that the faculty rather than 
the students should take the initiative in planning a lab- 
oratory experience. 


Cluster VIII: 
Faculty Behaviors Which Evaluate Learning and Progress 
(6 items) 

High scorers on this cluster are in accord with the value 
of evaluation procedures which clarify learning needs 
and formulate new objectives. Subjects agree that the 
evaluation process should be individualized in terms of 
the abilities, interests, and previous experiences of the 
students and reflect both students’ own self-appraisals and 
the faculty's consideration of external factors influencing 
student learning and progress. Subjects agree further that 
in evaluating student achievement, faculty members should 
expect no more of students than they would of themselves 
in similar situations. 

Low scorers on this cluster doubt that evaluation plays 
a prominent role in their learning. They question the 
likelihood that faculty members individualize the eval- 
uation process. Subjects do not agree that the faculty 
should consider students’ self-evaluation nor that evaluation 
conferences are used to point out areas for improvement. 


Cluster IX: 
Faculty Behaviors Which Support and Encourage Students 


(7 items) 


High scorers on this cluster are in accord with the im- 
portance of an enthusiastic faculty who make learning 
come alive for students. Subjects agree that such faculty 
are readily available to students for consultation and 
support and manifest interest in students as individuals 
by: (1) being sensitive to needs for repetition and rein- 
forcement of learning; (2) providing positive feedback 
related to student progress and achievement; and (3) com- 
municating empathy for the problems encountered in 
learning. Subjects agree further that supportive faculty 
members are discreet in their concern for the personal 
difficulties of students and able to make appropriate 
referrals for assistance with those difficulties both tact- 
fully and helpfully. 

Low scorers on this cluster are less demanding in their 
expectations of the faculty to provide support and en-
couragement. They do not value the ability of faculty 
members to identify with the learning problems of students 
or to be aware of the personal difficulties of students 
which influence learning and thus require assistance. 
These subjects are independent of the need for reassurance 
regarding their achievements in the nursing program. 


Methodology and Findings 

Using’ the object-analysis procedures described in 
earlier reports, typologies were constructed for each of 
the two classes at all three levels (1:41-42). Examination 
of the findings revealed that there were differences in the 
typologies created for the Classes of 1971 and 1972, a 
that there were differences within the same class at each 


Torte LANE Cluster Scores and Standard pep "1 : 
urricular Preference Typology Subgroups I and I, Cla 
of 1972 as Sophomores - ^ 


—M————————— 


Subgroup I Subgroup II 
N= 12 mo. 

Cluster Mean S.D. Mean ED 
1 40.90 9.11 52.61 6.86 
2 38.18 5.53 53.44 75 
3 47.27 7.20 50.71 87 
4 62.44 6.13 45.76 8.04 
5 51.30 7,34 464p T 
6 54.46 8.24 4914 869 
7 4.42. 4,08 — s19 69 
8 55.32 — 4.80 50.08 79 
7.94 


3 60.00 — 7.59 46.56 




of the three levels. The findings for the Class of 1972
are reported in this article. The mean cluster scores and
standard deviations for the curricular preference typology
subgroups for the Class of 1972 as sophomores are
presented in Table 1, for the Class of 1972 as juniors in
Table 2, and for the Class of 1972 as seniors in Table 3.


Table 2.—Mean Cluster Scores and Standard Deviations for
Curricular Preference Typology Subgroups I-V, Class of 1972
as Juniors

           Subgroup I     Subgroup II    Subgroup III   Subgroup IV    Subgroup V
           N = 7          N = 16         N = 18         N = 12         N = 15
Cluster    Mean   S.D.    Mean   S.D.    Mean   S.D.    Mean   S.D.    Mean   S.D.
   1       36.05  6.87    44.35  7.67    46.58  5.64    56.79  8.20    57.87  6.10
   2       38.49  6.41    54.16  8.27    43.86  6.43    52.72  5.51    57.22  6.82
   3       46.59  3.90    51.14  6.42    42.67 11.63    58.24  7.27    50.69  8.05
   4       47.36  8.77    47.20  8.14    54.95  7.30    59.18  8.15    40.27  4.71
   5       58.20  8.85    45.46  5.26    55.40  8.67    45.68  8.83    50.51  7.78
   6       66.14  3.77    49.92  8.19    54.73  4.61    44.91  8.18    44.50  5.51
   7       52.25  7.06    60.73  5.65    46.18  5.55    42.71  7.59    43.43  6.91
   8       61.96  5.95    50.69  5.47    55.31  8.31    37.32  3.75    47.75  6.25
   9       60.26  5.26    49.93  8.03    54.12  5.86    43.87  8.09    48.14  8.23


Table 3.—Mean Cluster Scores and Standard Deviations for
Curricular Preference Typology Subgroups I-V, Class of 1972
as Seniors
Subgroup III iri E V 
= 21 at —L 


Subgroup I Subgroup H N =2 
N - 14 N= 
S.D. Mean S.D. Mean S.D 
Cluster Mean — sp. Mean S2 Mean 2> 707 a in " none 
i 61 5.40 40.10 7-97 51.06 7.85 50.80 6.0 : : 
.83 1 . 
54.20 8.09 
2 42.68 7.35 s0.89 6.52 45.43 6.69 
63.02 3.69 . : — 
3 47.40 4-46 eat 5m — 4 8.08 
49.25 7.95 : i wwa S 
4 41.08 6.36 54.57 6.72 51.11 7.83 
47.15 6.76 . n "TE si 
55.11 5.65 
52.92 6.83 
3 45.09 8-59 
40.78 7.74 ; ga d 50.15 7.30 4221. 80 
° 44.96 7395 56 » go pue W T 
7 6.87 48 8.67 
48.83 9.13 54.80 s 7.30 54.78 6.39 46.20 4.37 
8 8.82 47. i 
ma. ax OE 49.57 6 46.17 gal 49.10 — 6.61 
At the sophomore level, the Class of 1972 was arranged
into two subgroups of similar profiles of curricular
preferences, and into five subgroups at each of the junior and
senior levels. In each instance, a number of students were
unassigned to the emerging subgroups because their profiles
of curricular preferences were not only atypical of the
classifiable subgroups, but also were atypical of each
other's preferences. Curriculum planning for "atypical"
students would have to be handled on a separate basis in
the typology model.

When the nine clusters are arranged in categories
according to their mean cluster scores for each of the subgroups
at each of the three levels, it can be seen that the priorities
of the subgroups within a class tend to differ (Tables 4,
5, and 6). Though there are five subgroups formed within
the population at each of the junior and senior levels,
they are not necessarily the same subgroups, nor will the
membership assigned to those subgroups be constant.

The very high (or high) and very low standardized mean 
cluster scores (between 40-45 and over 60) are generally 
interpreted as the defining characteristics of the cur- 
ricular preferences for each of the typology subgroups. In 
other words, the extreme mean cluster scores define the 
highest and lowest priorities of a given subgroup for their 
curricular preferences. The cut-off points may well vary 
depending on the range and distribution of the mean 
cluster scores for the members of a given subgroup. 
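
The categorization described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the category boundaries are taken from the column headers of Tables 4 and 5, the treatment of each interval as half-open is an assumption, and the function names `mcs_category` and `defining_clusters` are hypothetical. The example data are the Subgroup I means for the Class of 1972 as juniors (Table 2).

```python
# Category boundaries follow the column headers of Tables 4 and 5.
# Treating each interval as half-open [low, high) is an assumption.
def mcs_category(score):
    """Map a standardized mean cluster score (MCS) to its category label."""
    if score < 40.0:
        return "Very Low*"   # the tables flag scores below 40.00 with an asterisk
    if score < 45.0:
        return "Very Low"
    if score < 50.0:
        return "Med. Low"
    if score < 55.0:
        return "Med. High"
    if score < 60.0:
        return "High"
    return "Very High"


def defining_clusters(mean_scores):
    """Return (highest-priority clusters, lowest-priority clusters)."""
    high = [c for c, s in sorted(mean_scores.items())
            if mcs_category(s) == "Very High"]
    low = [c for c, s in sorted(mean_scores.items())
           if mcs_category(s).startswith("Very Low")]
    return high, low


# Mean cluster scores for Subgroup I, Class of 1972 as juniors (Table 2).
JUNIOR_SUBGROUP_I = {
    1: 36.05, 2: 38.49, 3: 46.59, 4: 47.36, 5: 58.20,
    6: 66.14, 7: 52.25, 8: 61.96, 9: 60.26,
}

if __name__ == "__main__":
    high, low = defining_clusters(JUNIOR_SUBGROUP_I)
    print("Highest priorities:", high)
    print("Lowest priorities:", low)
```

Under these assumptions the sketch picks out Clusters 6, 8, and 9 as the highest priorities and Clusters 1 and 2 as the lowest for this subgroup. As the text notes, the cut-off points may need to vary with the range and distribution of scores, so the fixed boundaries here should be read as one possible choice.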

When the pertinent clusters defining the priorities of 
a subgroup for the teaching-learning process have been 
identified, the appropriate high- or low-scoring statements 
describing curricular preferences are then reviewed for 
their curricular implications. For example, at the junior
level (Table 5), a composite statement could be written
combining the high-scoring statements accompanying
Clusters 6, 8, and 9 and the low-scoring statements for
each of Clusters 1 and 2. The resulting profile would be
a prescription to the faculty for the kinds of learnings
most preferred and most likely to be avoided by the
members of that subgroup. More explicitly, junior students in
Subgroup I favored organized instruction (Cluster 6),
faculty control and guidance over evaluation of learning 
activities (Cluster 8), and much support and encouragement 
from the faculty (Cluster 9). These subgroup members 
were also less concerned about the professional aims, 
objectives, and content of the program (Cluster 1) or the 
development of problem solving skills (Cluster 2). Cer- 
tainly it would be helpful for the faculty to know who 
these students are—in advance of semester planning—so as 
to plan better for the achievement of the program objectives 
by those students and to adopt the most useful teaching 
methods. 

In the senior year (Table 6), it can be seen that the
second subgroup of the typology was similar to Subgroup I
in the junior year typology of the same class. Particularly
interesting was the discovery that the membership of
Subgroup I in the junior year typology for the Class of 1972
was not the same as the membership of Subgroup II in the
senior year typology for the same Class of 1972. Though
data concerning the group memberships are not reported
in this article, it can be seen that it is somewhat fallacious
to depend on stereotypes about given students for curriculum
planning since they may well change. On the other
hand, it is equally specious to wait all year to find out what
a student's needs and preferences are. If means can be found
to identify students' curricular preferences and preferred
learning and teaching styles at the beginning of the
school year, why not do so, and plan accordingly?

If a typology structure can be obtained for a given class 
of students at the end of the previous year’s learning 
experiences, or very early at the beginning of the current 
year's learning experiences, composite statements describ- 
ing each of the subgroups' preferences for teaching-learn- 
ing processes as normally presented in the curriculum 
can be used by the faculty to: 

1. individualize the curriculum for each subgroup;

2. match the appropriate teacher(s) with each subgroup;

3. choose, deliberately and consciously, the learning
theories and teaching strategies most appropriate for
assisting each subgroup of students to achieve the aims of
the program.

It is essential to understand that the aim of the typology
model is to facilitate the achievement of the common
aims or goals of a structured program through the use of
a variety of instructional methods and procedures which
are chosen on the basis of the identified
preferences of subgroups of students within the same class.
In other words, it should not be presumed that a faculty
would always work with, or choose to reinforce, the highest
priorities of a given group of students. It may well be
in the better interests of the students to attempt to alter
the students' priorities so as to assist them to
meet the objectives of the program. The point is that with
the typology model, the instructional decisions (application
of theory, use of teaching strategy, selection of
learning experience, etc.) are made knowledgeably by
faculty well informed about the characteristics of their
students. Utilization of the model by a faculty assumes the
role of group learning and teaching in the program and
presupposes that the faculty members are committed to
the philosophy of individualized instruction. Use of the
model provides a means for combining the major
advantages of both group learning and individualized
instruction.




Implications

On the basis of the retrospective analysis of typology
formation at three levels of the curriculum for two separate
classes of students, it appears that the typology
model needs to be modified. Rather than determining the
typology structure once, and planning accordingly,
Table 4.—Nine Rationally Defined Clusters Arranged in Categories
According to Mean Cluster Scores (MCS) for CPT Subgroups I
and II, Class of 1972 as Sophomores.

Subgroup   Very Low      Med. Low      Med. High     High          Very High
Number     (MCS 40-45)   (MCS 45-50)   (MCS 50-55)   (MCS 55-60)   (MCS 60+)
I 2* 3 5 8 
1 y 6 9 ‘ 


*MCS below 40.00 


Table 5.—Nine Rationally Defined Clusters Arranged in Categories
According to Mean Cluster Scores (MCS) for CPT Subgroups I-V,
Class of 1972 as Juniors.

Subgroup   Very Low      Med. Low      Med. High     High          Very High
Number     (MCS 40-45)   (MCS 45-50)   (MCS 50-55)   (MCS 55-60)   (MCS 60+)
1 3 7 5 9 
2* 4 8 
6 
structure each year or at each level of a course sequence
(Figure 2). Even this feature is subject to change,
depending on the impact of individualized group-oriented
curricular approaches. It may well be that once the typology
approach is instituted, the typology structure and the
membership of the typology subgroups would remain
constant from year to year. Experimentation in this area is
recommended to test the model further.

If the typology is to be developed at each level of the
curriculum, the way is open for the faculty to develop a
Q-sort unique to the learning experiences for each level
of the program or course sequence. The items, however,
still should reflect the philosophy, aims, and program
objectives generalizable to all curricular areas. The Q-sort
items may reflect options or alternative means to achieving
certain combinations of objectives for a given level of
the program. Development of the typology prior to
instituting the program will assist the faculty in the proper
match of students to the learning experiences and teaching
styles available in the program.

The typology might be constructed on the basis of an
instrument common to all levels of the program or on the
basis of an instrument unique to each level. Further,
the typology might be developed at the end of one level
of the program for planning for the next level, or at the
beginning of the academic year at each level. These
decisions depend on the nature of the Q-sort instrument
developed for use, the amount of pre-planning done
in terms of curriculum options, the faculty's beliefs
concerning the amount of change which might occur in a
given group's preferences between the end of one level
of the program and the beginning of the next, and the
nature of the course sequence or program.

The usefulness of typology construction at the end of
the program might be questioned. In the authors' view
this procedure is essential to the acquisition of
baseline data necessary for follow-up studies of program
graduates and for continuous ongoing program evaluation.
The use of the same clusters at each level of the program
from year to year would provide a common base for the
formative evaluation necessary for interim feedback and
program improvement.

Figure 2.—The Revised Typology Model for Curriculum

The authors' experience with the typology model
raises the question: "Why build a battleship if a canoe
will get you where you are going?" Why go to the extreme
of developing a typology when in fact one could find the
answer to students' preferences by just asking them? The
authors are not convinced that exotic methods in themselves
are more productive or effective than simpler
techniques. But do faculty members consider the question
of student needs, characteristics, and preferences, let
alone ask it, when planning curriculum? There is merit
in developing a systematic approach to planning and
implementing curriculums. Turnover of faculty alone is a
major consideration. Is it not more effective for a patient's
history and record of diagnostic procedures to travel with
him to a new physician? We deplore submitting patients
to needless or repetitious procedures. Why do we submit
our students to constant, and unsystematic, procedures
in defining their learning needs and preferences? More
explicitly, why do we persist in assuming that there are no
differences among our students? Not only is the typology
method an efficient, controlled, and fairly
simple-to-implement strategy to diagnose learning needs of students;
it also provides the assurance of validity for those faculty
members who find comfort in statistical verification
rather than subjective hunches about the needs of students.
Furthermore, the typology model provides a method for
creative curriculum planning which minimizes the
likelihood of unplanned, unwanted, or unseen changes in the
curriculum. The authors contend that use of the typology
model provides a means for making faculty members
more conscious of the teaching-learning process and the
needs of the students whom they instruct.


The typology model, which differentiates the preferences
of students, is an aid in fostering the diversity of students
in higher education. When preferences of students have
been identified, the faculty can work with them, adopt
definitive teaching strategies, and formulate learning
objectives and experiences to enhance those differences.
Use of the typology model provides a mechanism which
facilitates implementation of programs based on mastery
learning. Individualized learning, through the group
structure, may shorten the amount of time needed for
a student to achieve the aims of a program.

The need for conscious structuring of curricular
approaches to implementation of professional education
programs is heightened by the heavy demands made by
society and the professions to clarify levels of professional
practice, formulate new definitions and standards of
professional practice, and accelerate passage through the
career ladder. The authors believe that the many
contingencies facing professional educators in academic and
clinical settings deter faculty from a conscious awareness
of their roles as teachers and of teaching and learning as
processes. While individualized curricular approaches to
professional education will not solve all the problems
faced by the educator, it is hoped that the typology
approach to curriculum implementation will single out the
primary focus of the educator's role, the student, and
provide a means by which faculty members can reorder
their priorities to provide greater emphasis on meeting
students' needs.
ABSTRACT


along two dimensions: the amount of influence it exerts upon À cha € tO! 
or NAVE. Several hundred teachers and teaching assistants is Follow thence a nand thie d degree to which this influence is positive 
field test of the instrument, with concurrent administration of the Purdue Teacher Opinione E the country participate Pedir 
morale. The high rate of return and correct completion on the EFI indicated that it is a ractical See established ip E 
dicated validity for the instrument in three important respects: (1) the pattern-of-importance rati nique. In addition, E E 
assessed patterns of physical and social distance; (2) the positive/negative ratings for particular D correspond to "d ent = - 
responding subscores of the PTO; and (3) the pattern of responses reflected aspects of internal c. rces correlated substan yy Hs 

EFI data are discussed: how to plot and interpret two-dimensional force fields, and how to use rig gala ha le a 


especially to facilitate the work of the teacher in the classroom. 


SITUATIONAL VARIABLES OF the educational set- 
ting have a significant influence on teachers and on the 
teaching/learning process in the classroom. Little is known 
about the extent or the type of influence of specific var- 
iables. Dreeban (4) has stated, "The study of the impact of 
the environment, both within school systems and from the 
external community, on the work of teachers has barely 
begun." 

In mapping the relationship of teachers to the environ- 
mental setting, the usual approach is to focus on the char- 
acteristics of teachers and their attitudes. For instance, 
the Minnesota Teacher Attitude Inventory (3) and the 
Purdue Teacher Opinionaire (2) both refer to "teacher 
morale" as the central variable. But if we are to improve 
the lot and the effectiveness of teachers, we must go be- 
yond teacher attitudes, and try to identify the salient sit- 
uational elements which affect both teacher morale and 
teacher effectiveness. 

Here the authors present a technique which charts the
educational setting considered as the teacher's work-space.
The name of this technique is the Educational Forces
Inventory, or EFI. It has been successfully tested and
applied "in the field." In this paper data are presented on the
initial validation of this technique, and a discussion of how
to plot force fields, how to interpret them, and how they
can be made useful for program implementation and for
modifying classroom processes is included.

The concept of a psychological field of force was
developed by Lewin (5) as a means of understanding an
individual's behavior in relation to his environment, and to
provide a common frame of reference for the interplay of
"internal states" and of "objective reality." For instance,
a child that is repeatedly prevented from approaching a
desired object, say a ball, eventually erects an internal
barrier which allows him to "forget" about it and thus
avoid further frustration. A force field reflecting this
situation will show the child in relation to the desired object,
with an intervening barrier making it inaccessible. In
addition, this representation may depict a whole range of
other dynamic consequences of the barrier, such as a
generalized constriction of the field that may inhibit the
child's locomotion in ways otherwise unrelated to the ball
and persisting long after the ball has gone. In short, a force
field is an "open" system for taking into account, at a
somewhat abstract level, any number of facts, such as
circumstances in the environment and patterns of behavior.

The concept of field of force was later extended to
deal with sociological phenomena (6). A different type of
force field, called a "phase space," was used to chart
characteristics of a group, such as ethnic prejudice or rate of
production in a factory, as a function of
forces acting over time:


Food habits of a group, as well as such phenomena as the
speed of production in a factory, are the result of a
multitude of forces. Some forces support each other, some
oppose each other. Some are driving forces, others
restraining forces. Like the velocity of a river, the
actual conduct of a group depends upon the level
(for instance, the speed of production) at which
these conflicting forces reach a state of equilibrium (6).


The interest in applying force field analysis in the
study of groups was related to a series of studies
which demonstrated that efforts directed at individuals
were relatively ineffective in changing social behavior, as
compared to the utilization of group process; and that often
change achieved in an individual context was short-lived,
at best, if not accompanied by corresponding changes in
the group.

To bring about institutional change it is necessary to
deal with the institutional environment itself. The logical
way to begin is to chart this environment as a field of
forces, each with more or less power, and each more or less
directed to or away from the desired state of affairs.
The EFI has been specifically designed to accomplish this
purpose in the educational milieu. The focal variable,
teacher effectiveness or classroom process, is a joint
characteristic of teachers and of educational setting; the
lens through which it is viewed is teacher attitudes;
and the forces that are charted as more or less powerful
and more or less helpful are easily identifiable entities
within the teacher's work-space.


Development and Initial Validation of the EFI

Designing the EFI: Background and Objectives


T 
Var we Response Educational Program, sponsored by 
Ucatio aboratory, is designed to improve the form of 
resp . he child, especially with 
vith It is therefore concerned 
be d the effectiveness of the 
hanges in the 
e training. 
districts, in 


al experience offered tl 


e Con 
t 
Volyeq Xt Of the national Follow Throug 
he interrelat 


e 
Of s iris fundamental changes in t 
"nts, “ and community, of students, 
trainer of teachers, teaching assistants, coar 
lon An important program evaluation objective, 
Ceive was to determine bor participating teachers per^ 
have e ose characteristic features of the program which 
foree, Plications for their social psychological field of 


To provide a description of the teacher's social-psychological
field, a technique must meet several practical
requirements. First of all, teachers will understand its use
and accept it. Second, teacher responses will reflect
independently verifiable facts in the external world, as
opposed to personal or role-oriented behavior. Third, the
responses will be differentiated into some meaningful pattern
that can serve as a guide to making decisions about program
implementation.

Elements of the EFI 


The EFI describes the teacher's social-psychological
field by identifying prominent elements in the environment
in the same terms as they overtly and tangibly
present themselves to the teacher. On the basis of previous
experience in implementing inservice training programs, the
following ten forces were selected as important in influencing
teacher morale and classroom effectiveness in public
schools generally:


1. Principal of the school

2. Central Office administrative personnel

3. Other Teachers in the school

4. Parents of children in the class

5. Curriculum prescribed by the district

6. Testing programs

7. Statewide Mandates on certification, curriculum,
grading, etc.

8. Physical Facilities available

9. Social Environment of the community

10. Curriculum Personnel such as reading specialist, art
teacher, etc.

In addition there were three program components of
particular relevance to implementation of the Responsive
Educational Program in the context of Follow Through:

11. Program Director: coordinator of the Follow Through
Program within the district; responsible for administration,
community organization, and policy matters.

12. Program Advisor: delivers the program to the classroom
with inservice training and in-class assistance;
each advisor is responsible for about ten classrooms.

13. Other Adult in the classroom (teacher or teaching
assistant): a teaching assistant in this model is a
full-time, paid paraprofessional assigned to the classroom
and engaged in teaching activities.


Recording Teachers' Attitudes on the EFI

To reflect the constellation of forces in the
social-psychological field, it is necessary to identify not only the
forces to be assessed, but also the dimensions along which
they are to be measured. Given any field, at least two
dimensions would be required for an adequate specification.
Consequently, the effects of individual forces upon
processes in the classroom were to be specified in terms of
both power (the amount of influence) and affect (the
degree to which influence is positive or negative in direction).
Table 1.—Means and Standard Deviations of Scores Assigned by Teachers and Teaching Assistants on all Forces and Tasks
of the Educational Forces Inventory


Forces Teachers (N = 214 


1. Principal 

2. Central Office 

3. Other Teachers 

4. Parents 

5. Curriculum 

6. Testing 

7. Statewide Mandates 
8. Physical Facilities 


9. Social Environment 


11. Program Director 
12. Program Advisor 


13. Other Adult^ ; 16.77 


1 The lower the number, the higher the rated importance.
2 The higher the number, the greater the rated importance.
3 The lower the number, the more positive the influence.


XA ET 
4.84 


8.78 
8.05 
7.18 
6.27 
8.45 
9.71 
7.47 
8.23 
7.59 
5.87 
5.17 
2.95 


Teaching 


S 
3.53 
3.76 
3.30 
3.36 
3.30 
3.23 
3.04 
3.34 
3.02 
2.98 
3.56 
3.28 
3.12 


Assistants 


Task 22 


X sd 
12.08 10.12 


4.12 
4.19 
6.63 
6.41 
3.18 
2.32 
6.20 
4.15 
4.41 

10.08 

11.20 

28.43 


6.95 
4.86 
7.77 
7.73 
5.07 
3.49 
8.03 
4.78 
6.18 


11.28 
10.78 
21.42 


4 The teacher is rating the teaching assistant and the teaching
assistant is rating the teacher.


In completing the instrument, the respondent—teacher 
or teaching assistant—is asked to evaluate the set of 13 
forces by carrying out three successive tasks: 

Task 1: The 13 forces are ranked in order of their im- 
portance in influencing teaching. The force with the strong- 
est influence, either positive or negative, is given the rank 
of 1, the least important the rank of 13. 

Task 2: Each force is assigned a weight according to its
relative importance in influencing teaching. A total of 100
points are distributed among the 13 forces, with the most
important force assigned the most points. Any pattern
of assignments is permissible: the respondent might choose
to distribute the points evenly among the 13 forces or
allocate them all to just one or two, with no points to the
rest.

Task 3: Each force is rated on a scale of 1 to 5,
according to its positive/negative effect on teaching, with a
rating of 1 indicating strong positive influence, and a rating
of 5 indicating strong negative influence.
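
The three tasks impose simple constraints that a "correctly completed" form has to satisfy, and these can be checked mechanically. The following Python sketch is an illustration only; the function name `validate_response` is hypothetical, and two points are assumptions not stated in the text: that Task 1 permits no tied ranks, and that Task 3 ratings are whole numbers.

```python
N_FORCES = 13  # the ten general forces plus three program components


def validate_response(ranks, points, ratings):
    """Return a list of problems found; an empty list means the form is usable.

    ranks   -- Task 1: rank of each force, 1 = strongest influence
    points  -- Task 2: weight of each force, 100 points distributed in total
    ratings -- Task 3: 1 (strong positive) to 5 (strong negative) per force
    """
    problems = []
    # Assumption: every rank from 1 to 13 is used exactly once (no ties).
    if sorted(ranks) != list(range(1, N_FORCES + 1)):
        problems.append("Task 1: ranks must use each of 1..13 exactly once")
    # Any distribution is allowed, including all points on one force.
    if (len(points) != N_FORCES or any(p < 0 for p in points)
            or sum(points) != 100):
        problems.append("Task 2: 13 non-negative weights must total 100")
    # Assumption: ratings are integers on the 1-to-5 scale.
    if len(ratings) != N_FORCES or any(r not in (1, 2, 3, 4, 5) for r in ratings):
        problems.append("Task 3: each rating must be an integer from 1 to 5")
    return problems


if __name__ == "__main__":
    ranks = list(range(1, 14))
    points = [100] + [0] * 12      # all points to one force is permissible
    ratings = [3] * 13
    print(validate_response(ranks, points, ratings))
```

A check of this kind mirrors the distinction drawn later between forms that were merely returned and forms that were correctly completed.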


Design for Field Testing the EFI 
A field test was planned to determine if the EFI was a 
viable technique, and specifically if it met its objectives 


1.82  .99 
2.65  .90 
2.63 -91 
2.51 1.01 
2.33  .99 
2.76 -96 
2.87 -9% 
2.51 1.07 
2.66 1.00 
2.40 -90 
1.88 -95 
1.76 -9% 
1.40 -8 


and satisfied the criterion related to practical utility. In
order to test the clarity and acceptability of the procedures
a trial with a large, unselected group of teachers was planned.
To test whether the instrument did, in fact, reflect
objectively ascertainable factors in the teacher's work-space,
the degree of similarity in the response patterns of
respondent pairs would be related to their degree of
similarity along important dimensions of the social-psychological
field, such as professional role (teacher vs.
teaching assistant) and operational unit (classroom, school).
Third, it was planned to relate responses on the EFI
to those on an older, established instrument.


A Concurrent Validity Criterion: The Purdue Teacher Opinionaire

The Purdue Teacher Opinionaire (PTO) was chosen as
a referent for assessing concurrent validity of the EFI. This
instrument had been developed and validated as a measure
of teacher morale (2). While the objectives of the EFI extend
far beyond teacher morale, no instrument was found
with a scope equally broad. The PTO manual cites several
studies which reported that scores of teachers correlated
appreciably with those of their principals, as well as with
P 
Table 2.—Rankings of Scores Assigned to Forces on Three
Tasks by Teachers (N = 214) and Teaching Assistants (N = 180)


Ranks for Task 1 


Ranks for Task 2 Ranks for Task 3 


Teaching 


Forces Teachers Assistants 
Principal 2 


Central Office 12 


Other Teachers 
Parents 

Curriculum 

Testing 

Statewide Mandates 
Physical Facilities 
Social Environment 
Curriculum Personnel 
Program Director 
Program Advisor 


Other Adult 
——— 


Spearman Rhos 


other logically related aspects of the educational setting
that could be assessed independently, such as teacher
turnover. In its current form, the PTO has 100 items which, in
addition to the overall score for morale, yield ten
subscores reflecting dimensions identified by factor analysis:

1. Teacher Rapport with Principal
2. Satisfaction with Teaching
3. Rapport among Teachers
4. Teacher Salary
5. Teaching Load
6. Curriculum Issues
7. Teachers' Status
8. Community Support of Education
9. School Facilities and Services
10. Community Pressures


Method and Sample

In the spring of 1972, some 300 teachers and a like
number of teaching assistants, in districts in
different states, were working with the Laboratory
to implement a Responsive Educational Program in
Follow Through classrooms, kindergarten through third
grade. The on-site program advisors in each district gave a
copy of each of the two instruments, the EFI and the PTO, to
each teacher and teaching assistant. The instructions were
to complete and return them directly to the Laboratory.


Teaching Teaching 


Teachers Assistants |Teachers Assistants 


3 3 
n 10 
8 9 
9 135. 
5 5 
13 12 
12 13 
7 7.5 
10 n 
6 6 
4 4 
2 2 
1 1 
——À 
(.97) 


The form for the EFI provided for several items of descriptive
information to be filled in by the respondent, such as
name, age, role (teacher/teaching assistant), and amount of
teaching experience. Strict confidentiality was pledged,
and in addition specific allowance was made for the option
of leaving off the name.


Results
The forms were received by a total of 572 teachers and
teaching assistants. Of these total recipients, 428 (or 75%)
correctly completed and returned at least the EFI; both
forms were completed and returned by 394 (214 teachers
and 180 teaching assistants) of these 428. For teachers,
the mean number of years of age reported was 35, and of
teaching experience, 9. All but two of the
teacher-respondents were women, and all but twenty had
previously taught in the Follow Through program.

Forces data were analyzed separately for teachers and
for teaching assistants. For each of the 13 forces, the
following statistics were computed for both groups: the
mean of the ranks assigned in Task 1, the mean number of
points assigned in Task 2, and the mean ratings on
the positive/negative continuum given in Task 3. These
means and the corresponding standard deviations are
reported in Table 1. These means were then rank-ordered
within task: the rank scores are presented in Table 2.
For each force, product-moment correlations were
computed between scores assigned on each of the three
Table 3.—Correlations* between Scores on Tasks 1, 2, 3 for Each
of the 13 Forces for Teachers (N = 214) and Teaching Assistants
(N = 180)

Forces   Tasks 1, 2 (ranking x points)   Tasks 1, 3 (ranking x pos-neg rating)   Tasks 2, 3 (points x pos-neg rating)
         Teacher   Teaching Assistant    Teacher   Teaching Assistant            Teacher   Teaching Assistant
1. Principal 63 52 38 54 17 38 
2. Central Office | 58 41 17 25 03 23 
3. Other Teachers | 68 40 | 48 28 32 26 
4. Parents 15 46 33 32 08 29 
5. Curriculum 61 44 40 27 39 23 e 
. " as “Correlations involving task 2 ae 
6. Testing 3 13 22 -05 16 been inverted in sign to adjust itie 
7. Statewide variations from task to task in = 
Mandates 51 40 13 06 14 o8 scheme for assigning number S 
task 1, high influence receiv! 
ical , ich influ- 
i facil ities 5 i 2A 39 7 42 rank of 1; on task 2, high influ iit 
n ence received the most pou Sd 
+ Socia e itive re 
d Environment 69 3 09 10 01 15 task 3, the most positiv: 
a score of 1. 
10. Curriculum 
Personnel 45 16 42 20 36 09 
11. Program Director} 52 41 45 42 25 " 
12. Program Advisor 62 46 35 4 17 32 
13. Other Adult 47 42 28 25 22 25 
Average correlation | (.53) (.41) (.29) (.29) (.18) (.25) 


Table 4.—Intercorrelation between Task 3 of the Educational 
Forces Inventory (EFI) and the Ten-Factor Subscores of the 
Purdue Teacher Opinionaire (PTO) Collected on 394 Follow 
Through Teachers and Teaching Assistants 


Rows: EFI 13 Forces. Columns: PTO Ten-Factor Sub-Scores (1-10).

[Table body illegible in the source scan.]

*All entries have been inverted in sign to compensate for a
difference in direction of scores: on the PTO a high score
indicates "good" or "high" morale, whereas the EFI defines
"1" as the most positive, and "5" as most negative.

**Correlations at the intersection points of corresponding
EFI forces and PTO factors have been circled.
different pairings of the three tasks, and are presented in
Table 3.

The ten-factor subscores of the PTO defined in the
manual were correlated with scores on Task 3 of the EFI
for all 13 factors. The correlation matrix is presented in
Table 4.


Patterns of Teachers and Teaching Assistants 


Table 1 suggests that the power and affect attributes
of each force were perceived similarly by teachers and by
teaching assistants. Table 2 confirms this. In all three tasks,
the means calculated for the 13 factors gave rise to
substantially the same rank-order sequence within each of
the two groups. For the three tasks, the Spearman rhos
between the two groups are: .95, .85, and .97. Both groups
considered Other Adult, Principal, and Program Advisor
to be the most important and most positive influences:
the mean scores of these three ranked among at least the
top four on all three tasks for both groups. Program
Director and Curriculum were also rated both influential and
positive. Forces such as Central Office, Testing, and
Statewide Mandates were considered to be the least important
and the least positive influences.

These data suggest that the various forces are perceived
as both more influential and more valued in direct relation
to the extent of their physical and psychological proximity
to the classroom: the nearer the force is located, the more
it is seen as powerful and positive, and the further away,
the more it is seen as weak and less positive. This direct
correspondence between EFI scores and a salient and
logically related aspect of the teacher's social-psychological
field offers an indication of construct validity for the
instrument. Moreover, these perceptions apply across
work-roles: they are as true for a certified teacher as for a
paraprofessional teaching assistant. This is another indication
of construct validity in that the correspondence between
EFI scores and the educational setting is not primarily
dependent on respondent characteristics.
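The agreement between the two groups' rank orders, summarized
above by Spearman rhos, can be illustrated with a short sketch.
It assumes untied ranks and uses the classical difference
formula; the two rank lists are hypothetical and are not the
Table 2 values.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("rankings must have the same length")
    n = len(ranks_a)
    # Sum of squared rank differences across the items (here, the forces).
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical rank orders of the 13 forces (1 = highest mean score).
teacher_ranks = [2, 11, 5, 6, 4, 12, 13, 9, 8, 10, 7, 3, 1]
assistant_ranks = [2, 12, 6, 5, 7, 11, 13, 9, 10, 8, 4, 3, 1]

rho = spearman_rho(teacher_ranks, assistant_ranks)
print(round(rho, 2))
```

A value near 1 would indicate that the two groups order the
forces in nearly the same way.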


Relationships among the Three Tasks


For particular forces, the scores assigned in Task 1 and
Task 2 are correlated moderate-to-high. This is to be
expected from the instructions, which ask for different
expressions of judgments along the same dimension: relative
strength of influence. This finding may be taken to be a
reflection of internal consistency.

The correlations between either of the first two tasks
with Task 3 are low-to-moderate. This again is consistent
with the instructions: Task 3 asks for ratings along a
dimension that is distinct from the one involved in the other
two tasks: positive/negative evaluation rather than relative
strength of influence. These low-to-moderate correlations,
then, indicate that, psychometrically as well as conceptually,
Task 3 represents a dimension that is distinct from the one
underlying the first two tasks. While distinct, the two
dimensions are by no means orthogonal: in most patterns
(though not all), the more a force is seen as having strong
influence, the more it is seen as having also a positive
direction of influence.

The correlations between Task 3 and Task 2 are lower 
than those between Task 3 and Task 1. Presumably this is 
because the scores on Task 2 are more volatile: many res- 
pondents assigned all of their 100 points to just one or two 
forces, leaving zero to the rest. 


Relationship of the EFI to the PTO 


The PTO includes dimensions that are, from their 
labels and from their definitions in the manual, similar 
to forces on the EFI. On the PTO a high score indicates a 
“good” rating, or high morale, whereas the EFI Task 3 
defines a score of 1 as most positive, and a score of 5 as 
most negative. To adjust for this inversion of scoring
direction, the signs on the correlation coefficients in Table
4 have been inverted to make a positive item reflect a
positive relationship between corresponding concepts.
Positive correlations are evidence of concurrent validity.
In fact, moderate-to-high positive correlations occur in
three of the five instances where the name-to-name
relationship is self-evident:

.68 between (1) Principal and (a) Rapport with
Principal
.43 between (13) Teacher/Assistant and (c) Rapport
among Teachers
.35 between (4) Parents and (h) Community Support

In the other two instances the correlations are low, but
also positive:

.26 between (5) Curriculum and (f) Curriculum
Issues
.20 between (8) Physical Facilities and (i) School
Facilities and Services

This series of correspondences between EFI forces and
subscales of the PTO offers clear evidence of concurrent
validity; since the conceptual match between corresponding
elements is only approximate, we did not expect
uniformly high correlations.


Implications for Program Implementation


One way to apply these data for evaluating program
implementation is to look at the influence exerted by the
three forces that represent the program. All three are
important for delivering program concepts and techniques to
the classroom. The presence of the Teaching Assistant, if
properly utilized, improves the teacher/pupil ratio by a
factor of two, making the child's classroom experience
more responsive to him. The Program Advisor works
directly with teachers to translate program goals and
concepts into classroom process. The Program Director
provides administrative and social support, especially to the
teaching assistant, whose very role is made possible by
Follow Through.

The pattern of EFI scores indicates that the impact of
Follow Through on classroom teaching was both powerful
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and positive. Table 2 presents the rankings of the mean 
scores, taken separately by task and by teacher/teaching
assistant. The teachers reported the force Other Adult, 
that is, the teaching assistant, to be highest of any force in 
amount of influence on both Task 1 and Task 2, and also 
the most positive in direction of influence on Task 3. The 
teaching assistants, for whom this force represents the 
teacher, returned the compliment and also scored it highest 
on all three tasks. This reciprocal appreciation suggests good 
rapport within the classroom and mutual cooperation on 
common goals. The Program Advisor was also viewed as 
both powerful and positive—the rank of importance scores, 
on both Task 1 and Task 2, was fourth for teachers, third 
for teaching assistants; and the degree to which this in- 
fluence was judged to be positive, on Task 3, was second 
only to Other Adult, for both groups. The Program Director 
was judged eighth in extent of influence by teachers and 
fourth by teaching assistants, but both groups judged this 
force as fourth highest in the degree to which the influence 
was positive. 
The differences in the pattern-of-importance ratings 
can be seen more explicitly by reference to the Task 2 
mean scores in Table 1. Since the total number of points 
assigned to all 13 forces is 100, the number of points and 
the percentage of total points is the same for any one 
force or for the sum of any number of forces. By adding the 
Task 2 points for forces 11, 12, and 13, we see that the 
number of points assigned to all three Follow Through 
components was 33.2% of the total for teachers, and 49.7% 
for teaching assistants. It appears that the differences
observed between teachers and teaching assistants in their
respective patterns-of-importance rankings relate not simply
to differences in the way the two groups perceive individual
Follow Through components, but actually reflect a clear-cut
difference in their perceptions of the program as a whole.
These differences between teachers and teaching
assistants in their estimations of the influence of the Follow
Through program and its components on the classroom can
be understood in terms of role differences in the
social-psychological impact of Follow Through. The advent of
Follow Through created a new role for the teaching
assistant, and she sees the inservice training and other
assistance provided by program personnel as important
resources in fulfilling this role in the classroom. But the
teacher was there before Follow Through, and will continue
to rely on and be influenced by other elements, such as the
Curriculum and the Social Environment. For instance, the
teacher attaches greater importance to the curriculum
because she has to take greater responsibility with respect to
roles and mandates built into the educational system
independently of Follow Through.


Using the EFI for Force Field Analysis 


Two Dimensions of the EFI 
The strength of the EFI lies not so much in its ability
to measure attitudes in specific areas as in its ability to
reflect two different aspects of influence, power and affect,
simultaneously. This feature is essential for charting a force
field.


Plotting a Force Field 


Force field analysis uses a plot locating each force along
the two dimensions of influence, power and affect. For
example, in Figure 1 there are two plots, each representing
a different school district. The two axes represent the two
dimensions: vertical for power, horizontal for affect.

Each district's own norms were used to calculate the
coordinates along each dimension: Task 1 scores were used
for power, Task 3 scores for affect (Task 2 scores were not
utilized). To calculate the coordinates along the power
dimension, all Task 1 ratings were totaled, across all raters
within the district and across all 13 forces, to get an overall
mean; the totals for the 13 forces were used to calculate the
overall standard deviation; then, for each force separately,
the mean across all raters was subtracted from the overall
mean and divided by the overall standard deviation to yield
a z-score deviation. The same procedure was used for
calculating the affect coordinate using Task 3. These two
z-score deviations were then used as the coordinates for
locating each force as a point on the plot. All processing,
including charting, was carried out by computer.

Forces that appear in the upper right-hand portion of
the grid are those rated by teachers as having the strongest
and most positive influence. Those forces located in the
upper left-hand quadrant are also strong, but exert less
positive influence. The EFI yields unique force field patterns
for different school districts. In District C, the Principal
and Program Director exert positive influence. In District
E, both the Principal and Program Director are perceived
as less influential.
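The coordinate calculation described above can be sketched in a
few lines. This is an illustration under stated assumptions, not
the authors' program: the function name is hypothetical, every
rater is assumed to have scored every force, and the "overall
standard deviation" is taken to be the sample standard deviation
of the 13 force means. The force mean is subtracted from the
overall mean, as in the text, so that a low score (rank 1, or the
most positive rating) yields a positive coordinate.

```python
from statistics import mean, stdev


def force_field_coordinates(ratings):
    """Return one z-score deviation per force.

    ratings: list of rows, one per rater; each row holds that
    rater's scores for the forces, in a fixed order.
    """
    n_forces = len(ratings[0])
    # Mean score of each force across all raters in the district.
    force_means = [mean(row[i] for row in ratings) for i in range(n_forces)]
    # With complete data this equals the mean over all ratings.
    overall_mean = mean(force_means)
    # Assumption: sample SD of the force means.
    overall_sd = stdev(force_means)
    return [(overall_mean - m) / overall_sd for m in force_means]


# Hypothetical Task 1 and Task 3 scores for three forces and two raters.
power = force_field_coordinates([[1, 2, 3], [1, 2, 3]])
affect = force_field_coordinates([[2, 1, 3], [2, 1, 3]])
# Each force becomes an (affect, power) point: horizontal, vertical.
points = list(zip(affect, power))
```

Each resulting pair locates one force on the plot, with affect on
the horizontal axis and power on the vertical axis.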


Significance and Use of Force Field Analysis

The EFI can be used to identify patterns of forces for
schools within a district. Figure 2 shows how six principals
were rated by their teachers in District E. Overall, the
principals in this district were rated as having a low, mildly
positive influence. When individual principals are plotted,
considerable variability is evident.

This procedure yields important information. A pattern
of forces may be viewed within a particular school or district,
or specific forces may be examined across various schools
and districts. Moreover, comparisons can be made on the
basis of local, regional, or national norms. At the Far West
Laboratory this information is being used to assist in
program implementation and program improvement.

The force field pattern of an individual district may
point up a problem with respect to some element in
the educational setting. The influence of teachers may be
perceived as low or not too positive, indicating a
need for inservice training. Parents may be viewed as
unimportant or a negative influence, suggesting the need for
[Figure 1. Plots of z-Scores of Forces that Influence Teachers in
Two Communities. Legend: Principal, Central Office, Other Teachers,
Parents, Curriculum, Testing, Statewide Mandates, Physical
Facilities, Social Environment, Curriculum Personnel, Program
Director, Program Advisor, Other Adult.]
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community-oriented efforts. It may be that an important 

resource, such as the Program Advisor, who is responsible 
for inservice training, is viewed as positive but weak in 
influence, suggesting the need for special training for this
important staff trainer. Or it may be that a resource is 
identified as being strong but not very positive in influence, 
suggesting a need for a more basic reorientation of the 
direction taken by that resource. 

Specific information about such conditions provided by 
force field analysis can be a first step in dealing with the 
underlying problems. For example, the information de- 
picted in Figure 2 is a revealing commentary about the way 
principals in the various schools of a district are relating to 
the work of teachers. This information was actually used as 
the framework for a series of on-site workshops conducted 
for those principals that dealt with the principal's role and 
function as it relates to the educational program being im- 
plemented in the district. 

Currently, the most valuable application of forces data 
is in educational planning, and especially with reference to 
preservice and inservice training of teachers. 

Other important uses of the forces data allow an educa- 
tional change-agent to monitor the effects of programs on 
the system. In such a way, district staff responsible for staff 
development can be monitored, with regard to teacher re- 
ceptivity. Combined with other implementation data such 

as how much children are learning and how teachers are 
performing in the classroom, the EFI data can assist in the 
explication of the change process. 

Further, as we learn more about how a packaged pro- 
gram is best delivered, implemented, and institutionalized 
within a school district, the EFI technique can be of tre- 
mendous value. It can help to document a school district’s 
pre-existing conditions in an analytic way. Such conditions 
can more adequately be studied as they affect the implemen- 
tation of a specific program. Ultimately patterns of con- 
ditions can be linked with delivery strategies to achieve 
maximum effectiveness in program implementation. 


Summary and Conclusions 
The EFI was constructed to reflect adequately the
concerns of teachers. It provides information about the
patterns of influence exerted on the teaching/learning
process by significant elements in the educational setting. This
information is useful in meeting the needs of teachers, in
program implementation, in setting priorities, and in
evaluating program impact. From the summary of data
presented below, it is evident that the EFI instrument is
both practical and valid:

- The items on the instrument correspond to actual,
tangible elements that can be considered and, if necessary,
modified to the advantage of the teacher, the program,
and, therefore, the educational process.

- Teachers and teaching assistants found the instrument
simple and easy to understand.

- The scores assigned to forces were clearly related to
the objective relationship of the forces to the work of
the classroom: e.g., the closer the force to the classroom,
the greater and more positive the influence attributed
to it. The effect due to the objective relationship
of a particular force to the work of the classroom
was far greater than that due to the effect of
professional role status, as to teacher/teaching assistant.


FOOTNOTES 


1. The authors are grateful to Dr. Glendon P. Nimnicht for
originally pointing out the need for this technique and for his
help in conceptualizing the problem; and to Dr. Stephen She[illegible]
for his help in data analysis.

2. Follow Through is a federally funded program to provide
comprehensive services to children from kindergarten through third
grade. A sponsor, such as, in this case, the Far West Laboratory for
Educational Research and Development, works with participating
school districts to implement a specific instructional model,
providing curriculum materials, inservice training for local staff, etc.
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ABSTRACT 
The Educational Forces Inventory (EFI) is a technique that charts the influence of elements in the educational setting from the
perspective of the classroom teacher. It uses ratings collected individually from teaching adults, but is especially useful for
mapping sociological as opposed to psychological variables. Here the validity of the EFI is explicated by the findings of two studies
carried out in sociological contexts: (1) for pairs of respondents, the closer their positions within the field of forces, along field-relevant
dimensions of either geographical-organizational distance or work-role category, the greater the similarity in the EFI patterns
generated; [...] individual schools to teacher-generated force field patterns with a [...]
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Description of the EFI’ 


Elements of the EFI 

The EFI is based on forces in the teacher's
social-psychological field which have a significant influence on
morale and effectiveness. There are 13 forces, selected
and delineated to correspond to elements of the
classroom teacher's work-space as they present themselves to
the teacher. Ten are relevant to public schools generally:

1. Principal of the school
2. Central Office administrative personnel
3. Other Teachers in the school
4. Parents of children in the class
5. Curriculum prescribed by the district
6. Testing programs
7. Board of Education
8. Physical Facilities available
9. Social Environment of the community
10. You, Yourself
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12. Program Advisor— delivers the program to the 
classroom with inservice training and in-class 
assistance: each advisor is responsible for about 
ten classrooms. 

13. Other Adult in the classroom—teacher or teaching 
assistant: a teaching assistant in this model is a 
full-time, paid paraprofessional assigned to the 
classroom and engaged in teaching activities. 


Recording Teachers' Attitudes on the EFI


In completing the instrument, the respondent— 
teacher or teaching assistant—is asked to evaluate the set 
of 13 forces by carrying out three successive tasks: 

Task A: Each force is rated on its importance in in- 
fluencing teaching, on a scale of 0-9. A rating of 0 in- 
dicates no influence, a rating of 9 indicates a strong in- 
fluence of either positive or negative effect. 

Task B: Each force is assigned a weight according to its 
relative importance in influencing teaching. A total of 
100 points are distributed among the 13 forces, in direct 
proportion to their amount of influence. Any pattern of 
assignments is permissible; for instance, the respondent 
may choose to distribute the points evenly among the 13 
forces or allocate all 100 points to just one or two of them. 

Task C: Each force is rated on its positive/negative 
effect on teaching, on a scale of 1 to 9, with a rating of 1 
indicating strong negative influence, and a rating of 9 in- 
dicating strong positive influence. 
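
The constraints that the three tasks place on a completed form
can be expressed as a small validity check. The sketch below is
only an illustration: the class name, field names, and messages
are hypothetical, and the rules encoded are just those stated in
the task descriptions above (13 forces; Task A on a 0-9 scale;
Task B points totaling 100; Task C on a 1-9 scale).

```python
from dataclasses import dataclass

N_FORCES = 13


@dataclass
class EFIResponse:
    task_a: list  # importance ratings, one per force, 0-9
    task_b: list  # points per force, totaling 100
    task_c: list  # positive/negative ratings, one per force, 1-9

    def problems(self):
        """Return a list of rule violations; empty if the form is valid."""
        found = []
        tasks = (("A", self.task_a), ("B", self.task_b), ("C", self.task_c))
        for name, scores in tasks:
            if len(scores) != N_FORCES:
                found.append(f"Task {name}: expected {N_FORCES} scores")
        if any(not 0 <= s <= 9 for s in self.task_a):
            found.append("Task A: ratings must lie between 0 and 9")
        if any(s < 0 for s in self.task_b) or sum(self.task_b) != 100:
            found.append("Task B: points must be non-negative and total 100")
        if any(not 1 <= s <= 9 for s in self.task_c):
            found.append("Task C: ratings must lie between 1 and 9")
        return found


# A hypothetical respondent who gives all 100 points to two forces.
response = EFIResponse(
    task_a=[9, 2, 5, 4, 6, 1, 0, 3, 3, 7, 5, 8, 9],
    task_b=[60, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 40],
    task_c=[8, 4, 6, 5, 6, 3, 2, 5, 5, 7, 6, 8, 9],
)
print(response.problems())
```

A check of this kind would also identify returns that are
incomplete on one or more tasks.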


Plotting a Force Field Pattern for Schools 


Force field analysis generates a plot locating each force 
along the two dimensions of influence, power and affect. 
For example, Figure 1 shows two plots, representing the 
EFI patterns constructed from the aggregate of teachers 
in each of two different schools in the same district. 

The two dimensions of the page represent the two dimen- 
sions of influence assessed: vertical for power, horizontal 
for affect. Task A scores were used for power, Task C 
scores for affect. Scores of Task B overlapped considerably 
with those of Task A and were therefore not utilized. 

In each plot, the position of each force is determined by 
a pair of coordinate values that correspond to z-score 
deviations of the school relative to the whole district. For 
instance, to calculate the Task A (power) coordinate of 
Principal, the mean rating for Principal on Task A, over all 
respondents in the school, is subtracted from the mean
rating for Principal over all respondents in the district. 
This difference is then divided by the standard deviation 
of the set of means for Principal for all schools in the dis- 
trict. The procedure for Task C (affect) is analogous. 
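
The school-level coordinate can be sketched as follows. The
function and variable names are hypothetical, the standard
deviation is assumed to be the sample standard deviation of the
school means, and the direction of subtraction (district mean
minus school mean) follows the description above as written.

```python
from statistics import mean, stdev


def school_coordinate(ratings_by_school, school):
    """z-score deviation of one school from its district, for one force.

    ratings_by_school: dict mapping a school name to the list of
    ratings given to the force by respondents in that school.
    """
    school_means = {s: mean(r) for s, r in ratings_by_school.items()}
    # District mean is taken over all respondents, not over school means.
    all_ratings = [x for r in ratings_by_school.values() for x in r]
    district_mean = mean(all_ratings)
    # Assumption: sample SD of the set of school means.
    sd_of_means = stdev(list(school_means.values()))
    return (district_mean - school_means[school]) / sd_of_means


# Hypothetical Task A ratings of Principal in a three-school district.
principal = {"A": [2, 4], "B": [6, 8], "C": [4, 6]}
print(school_coordinate(principal, "A"))
```

Repeating the calculation with Task C ratings gives the second
coordinate, and the pair locates the force on the school's plot.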

Forces that appear in the upper right-hand portion of
the grid are those rated by teachers as having the highest
and most positive influence on their teaching in the
classroom. Those forces located in the upper left-hand quadrant
are also rated as having relatively high but less positive
influence. Two patterns of influence are evident for the two
schools. In School A the Principal exerts a strong positive
influence; in School B the Principal is less influential. In
both schools, the Teaching Assistant is seen as having a
strong influence; in School A, however, this influence is
perceived as distinctly less positive than it is in School B.


Practicality, Validity, and Reliability of the EFI 


Preliminary Indications 

The initial paper on the EFI reported data on its prac- 
ticality and validity: 

- Teachers understood and accepted the task of
completing the instrument, as evidenced by their ability and
willingness to respond. [See (2:29).]

- Ratings assigned were primarily related to the
external factors in the school/work settings, as opposed to
individual characteristics of the respondents, such as
professional role. [See (2:31).]

- Concurrent validity was demonstrated with reference
to the Purdue Teacher Opinionaire, an older
instrument that was administered concurrently. [See (2:[illegible]).]


More Data on Validity and Reliability: Four Studies 


The four studies reported here, with new data in new 
contexts, enlarge upon these preliminary indications of 
validity and also present data on reliability/stability. 
Study I and Study II deal with aspects of validity in the 
interpersonal context, using data obtained in a program- 
wide survey of all teachers and teaching assistants working 
with the Responsive Educational Program in the spring of
1973. The other two studies deal with aspects of the
reliability/stability of force field patterns aggregated over
individuals in organizational units of various levels: Study
III uses data obtained from a single district, on two
occasions four weeks apart, in early 1974; Study IV is based
on data collected in program-wide surveys in spring 1973
and spring 1974.

Validity of EFI Patterns Tested in Two Interpersonal 
Contexts 


Study I: Validity with Reference to Patterns of Agreement
among Respondents


It follows from the work of Lewin (1) that the closer to
each other two observers are located within a field of
social-psychological forces, the more alike they will be in
the way they perceive these forces impinging on them and
on their work. For classroom teachers, two important
dimensions may be used to define relative position in the field
of forces corresponding to the educational setting: (1)
organizational operating units such as class, school, and district;
and (2) work roles such as teacher and teaching assistant.
Since perception of the field of forces is dependent upon
the observer's position within it, one way to check whether
the EFI technique generates valid results is to examine the
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patterning of data along these two dimensions. If the pat- 
terns accurately reflect reality, it is to be expected that the 
nearer any two respondents are to each other along these 
dimensions, the more alike they will be in their responses
about the field. In other words, a validity criterion is such
that, with respect to organizational units, it will be the 
case that, on the average: 


—pairs of respondents teaching in the same classroom 
will be more alike in their responses to each other 
than pairs teaching in different classrooms; 

—pairs within the same schools will be more alike 
than pairs in different schools; 

—pairs within the same school district will be more 
alike than pairs in different districts. 


Similarly, with respect to professional roles, it will be 
the case that, on the average, pairs of either teachers or of 
teaching assistants will be more alike in their responses than 
mixed pairs of teachers/teaching assistants. 

The Responsive Education Program provides systematic 
variations along both the dimension of organizational oper- 
ating unit and the dimension of professional role. The pro- 
gram extends over 14 school districts in 12 states, grades 
kindergarten through third, thus providing an opportunity 
to compare individuals that are both administratively and 
geographically distant. At the same time, since one teacher 
and one teaching assistant are assigned to each classroom, 
it is possible to compare individuals who work quite closely 
together. Moreover the pattern of parallel assignments pro- 
vides the opportunity to compare easily and meaningfully 
across the dimension of professional role as exemplified 
by these two categories, teacher and teaching assistant. 

An EFI survey was directed to all of the roughly 700 
teachers and teaching assistants working within the
Responsive Education Program during the school year
1972-73. Of the 604 returns, 29 (mostly from teaching assistants)
could not be fully processed: 24 because they could not 
be identified as to the respondent's classroom and/or work 
role; and another 5 because they were incomplete on 
more than one task. There remained 515 valid returns, 

304 from teachers, and 211 from teaching assistants. 

For each of the three tasks the average correlation was 
computed for teacher/teaching assistant pairs: (1) within 
the same classroom; (2) in different classrooms within the 
same school; (3) in different schools within the same dis- 
trict; and (4) from different districts. Average correlations 
were also computed separately for pairs of teachers and 
for pairs of teaching assistants for the last three-categories, 
Since correlation coefficients are not additive, the average 
correlation for a particular category of pairs was computed 
indirectly as follows: the product-moment correlation 
coefficients were computed for all pairs within the category, 
then converted to their z-score equivalents; next, the mean 
of the z-score equivalents was computed and converted 
back to a correlation coefficient. Because of the excessive 


number of potential pairings in the “Different Districts” 
category (around 40,000), the average correlations here 
were based on a sample of 1,000, randomly selected out of 
all possible pairs. Where a task was not completed, the 
respondent was dropped from the pairings for that task 
but included in those tasks that were complete. All
processing, including matching, calculating, and randomizing,
was carried out by computer. The results are presented in 
Table 1, and charted in Figure 2. 
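
The indirect averaging procedure described above can be sketched
as follows, on the assumption that the "z-score equivalents" are
Fisher's r-to-z transforms (the inverse hyperbolic tangent of r).
The function names are hypothetical, and the transform is
undefined for correlations of exactly 1 or -1.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_correlation(rs):
    """Average correlations via Fisher's z: transform, take the mean, invert."""
    zs = [math.atanh(r) for r in rs]  # r -> z
    return math.tanh(sum(zs) / len(zs))  # mean z -> r


# Hypothetical pairwise correlations within one category of pairs.
print(round(average_correlation([0.0, 0.8]), 3))
```

Because the transform stretches values near 1, the result can
differ from the plain arithmetic mean of the coefficients; for
the two values shown, the arithmetic mean would be 0.4.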


The patterns of responses on the EFI reflected a direct
relationship between relative closeness of the organizational
operating unit of class, school, and district on the one hand,
and congruity of perceptions on the other. This trend
prevailed throughout and was most evident for the
Teacher/Teaching Assistant pairs (different-role) on Task A, where
the correlations diminished in regular steps from .49 to
.31. The differences were less dramatic for either
Teacher/Teacher or Teaching Assistant/Teaching Assistant
(same-role) pairs mainly because one of the end points on the
unit's dimension, the "Same Class" category, denoting
two respondents teaching in the same classroom, was
missing, thus reducing the range of variation.

From these data it can also be said that the perceptions
of the field of forces are more similar within each of the
two job categories than across them: correlations for
different-role pairs are generally lower than corresponding
ones for same-role pairs. In Figure 2, this relationship is
reflected in the fact that the line representing different-role
pairs is the lowest of the three.

With respect to both these dimensions,
geographical-organizational distance and professional role, the EFI
reflects real differences in the social-psychological field in
consistent ways. For pairs of respondents, the closer their
positions within the field of forces, along either dimension,
the greater is the similarity in their patterns of response to
the set of forces on the EFI.


Study II: Validity of the Force Field Pattern: Recognition


To test whether the EFI leads to valid descriptions of
the constellation of forces in a particular school, force field
plots may be compared to independent assessments made
by knowledgeable individuals. Such a comparison was
carried out using a forced-choice procedure. A force field plot
was constructed for each of the 61 schools in the 13
districts in which the Responsive Education Program
functioned in at least two schools (in the 14th district all
Follow Through classrooms were in a single school). These
plots, identified only by a code number, were then
distributed to the respective districts, and the district program
personnel involved in supervisory-level positions (program
directors and program advisors) were asked to identify the
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school represented by each plot. The results are presented 
in Table 2. 

The six schools in District 4 had been divided into three 
sets of two, in correspondence with the program advisor as- 
signments. By chance alone, the expected number of cor- 
rect matches, or “hits” would have been one for each set of 
schools to be matched, or 15 in total. The actual number of 
hits was 27 out of 61 possible schools. Thus, the improve- 
ment over chance is significant, both practically and statis- 
tically (z = 2.3, p <.02). 
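
The chance expectation of one hit per set, whatever the number
of schools in the set, can be checked by enumeration. The sketch
below assumes that a judge assigns each plot to a different
school, so that every one-to-one assignment is equally likely
under pure guessing; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def expected_hits(n_schools):
    """Mean number of correct matches over all one-to-one assignments."""
    total_hits = 0
    n_assignments = 0
    for guess in permutations(range(n_schools)):
        # A hit occurs when plot i is assigned to school i.
        total_hits += sum(1 for plot, school in enumerate(guess) if plot == school)
        n_assignments += 1
    return Fraction(total_hits, n_assignments)


for n in range(2, 7):
    print(n, expected_hits(n))
```

Since the expectation is one hit per set, 15 sets give the 15
expected hits cited above, against which the 27 actual hits were
compared.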

Some districts had greater accuracy in matching than 
others. In every case where there were only two schools to 
choose from, the choices were correct. This was the case 
with Districts 1, 2, and 3, in each of which only two schools 
were involved, and in District 4, where the set of six schools 
had been divided into three sets of two, and the matching 
for each pair was done by the program advisor working 
with the particular pair. In contrast, the number of hits 
scored in the four districts with six or more schools to 
choose from was only slightly better than expected by 
chance alone: expected hits, 4 of 27; actual hits, 6 of 27. 
Where the number of schools to choose from was four or 
five, the improvement over chance was intermediate between
these two: expected hits, 5 of 22; actual hits, 9 of 22.

In general, the matching was done by the program director,
whose district-wide responsibilities were at least one
step removed from the classroom. In the smaller districts,
however, the program director might also double as program 
advisor with in-class activities. And in District 4, the match- 
ing was done by the particular program advisor involved in 
each subset of two schools. Evidently, accuracy in match- 
ing was directly related to proximity to the classroom in 
terms of administrative distance and work role. 

To explore this notion further, another round of judg- 
ments was collected. The schools in three of the larger 
districts (8, 12, and 13) were partitioned by the program 
advisor in charge, who was requested to carry out the match- 
ing (Districts 10 and 11 could not be polled in this round 
because of particular difficulties existing there at the time, 
such as excessive turnover of staff; District 9, though it had 
Follow Through in 5 different schools, was actually one of 
the smaller districts, with only 14 classrooms and 2 program
advisors; and District 12 could not be included because
each of the program advisors was involved in only
one school). The results are presented in Table 3. 

By chance alone, the expected number of hits would 
have been 8 of 17, and the actual number of hits was 15. 
Table 4 summarizes the data from both rounds of matching 
in terms of the number of schools in the set to be discriminated.
There were 29 cases where matching involved selecting
from a set of two or three schools at a time: 12 in the first 
round, in Districts 1, 2, 3, and 4; and all 17 in the second 
round. For these 29 the number of hits expected from 
chance alone was 14, and the actual number scored was 27. 
Thus, the index of improvement over chance, Kappa

Table 2. Number of "Hits" in First Round of Matching Schools
to Force Field Plots, by District

District   Number of   "Hits" Expected   Actual         "Hits" in Excess
           Schools     by Chance         "Hits"         of Chance
1          2           1                 2              1
2          2           1                 2              1
3          2           1                 2              1
4*         6           3                 6              3
5          4           1                 1              0
6          4           1                 2              1
7          4           1                 2              1
8          5           1                 1              0
9          5           1                 3              2
10         6           1                 [illegible]    [illegible]
11         6           1                 [illegible]    [illegible]
12         6           1                 [illegible]    [illegible]
13         9           1                 2              1
Total 13   61          15                27             12

* Instead of the six plots being sent to the district's program director, they
were sent, two each, to the three program advisors who had been working
directly with the two schools for a year or more.


= .87 for sets of 2-3 schools; for sets of 4-5 schools,
Kappa = .22; and for sets of 6-9 schools, Kappa = .09.
Clearly, the discriminations were much better for smaller
sets, where the rater was operationally closer to the classroom.
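
The Kappa index is defined in the footnote to Table 4 as hits in excess of chance divided by potential improvement. A minimal sketch follows; it assumes that "potential improvement" means the number of schools minus the chance-expected hits (the excess a perfect matcher would score), an interpretation that the text does not spell out. The function and argument names are illustrative.

```python
def kappa(actual_hits, chance_hits, n_schools):
    """Hits in excess of chance divided by the potential improvement.

    Potential improvement is taken here to be n_schools - chance_hits,
    which is an assumption of this sketch."""
    return (actual_hits - chance_hits) / (n_schools - chance_hits)


print(round(kappa(27, 14, 29), 2))  # sets of 2-3 schools -> 0.87
print(round(kappa(6, 4, 27), 2))    # sets of 6-9 schools -> 0.09
```

With the counts quoted in the text, this reading reproduces the reported values for the smallest and largest sets.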


Reliability/Stability of EFI Patterns of Operational Units
at Different Levels


Reliability vs. Stability 


An important aspect of instruments such as standardized
achievement tests or personality inventories is their test-
retest reliability, meaning the extent to which an individual's
performance is constant over a period of time that is
long enough to mitigate the effects of memory,
momentary set, etc., but not so long that the abilities or
the personality of the individual have changed appreciably.
The social-psychological field depicted by the EFI is constantly
changing because of changes in the external environment
as well as in the individual raters. But if the
force field plots are based on a larger number of raters, the
idiosyncrasies of individuals tend to average out, allowing
the effects of external change to be reflected all the more
clearly. Therefore, in referring to consistency of patterns
from one testing occasion to another, it is more appropriate
to think in terms of "stability" rather than
"reliability."


Study III: Stability of Individual and Group Patterns over
Four Weeks


Arrangements were made for two administrations of the
EFI in one of the districts. All teachers and teaching
assistants in District 4 were asked to complete the EFI;

Table 3. Number of "Hits" in Second Round of Matching Schools to
Force Field Plots, by Program Advisor

           Program   Number of       Number of "Hits"   Number of   "Hits" in
District   Advisor   Schools/Plots   Expected by        Actual      Excess of
           Number    to be Matched   Chance Alone       "Hits"      Chance
8          1         2               1                  2           1
           2         2               1                  2           1
           3         2               1                  0           -1
12         1         3               1                  3           2
           2         2               1                  2           1
13         1         2               1                  2           1
           2         2               1                  2           1
           3         2               1                  2           1
Total (3)  (8)       17              8                  15          7
Table 4. Number of "Hits" in Round One and Round Two Combined,
by the Number in the Set to be Matched

Number in    Number of       Number of "Hits"   Number of   "Hits" in
the Set to   Schools to      Expected by        Actual      Excess of
be Matched   be Matched      Chance Alone       "Hits"      Chance      Kappa*
2-3          29              14                 27          13          .87
4-5          22              5                  9           4           .22
6-9          27              4                  6           2           .09

* Kappa = Hits in Excess of Chance ÷ Potential Improvement.


dividual Patterns of Responses over Four Weeks, by Task 


Task A | Task B Task C 
7 


Number of Correlations Averaged 37 2 29 
Mean of Corresponding z -Scores .68 .88 .84 
elation Ratios .59 7 .68 


Corresponding Corr’ 
ns were: Task A, 95: Task B, .98; and 
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tee nses obtained from each individual on each of the . In the usual psycho ogical test the responses on eae i 
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gps i In the EFI, the number of individuals averaged to get a 
y the number of res- 


attern is limited only b 


ack to correlation 
unit of interest. 


COnye i 
(3) "ring correlation ratios to 2-SCOr 
tati : 
particular p 


takir 
o 1g the mean; and (4) converting b 


Stability te results are presented in Table 5- The average e 

Coeffic; : 4 pondents ; I 

reip range uu ane : The average test-retest correlations obtained for the two 
d the district as à whole, represent 


or the district a5 à averag 
asks classes individuals an 

he three tasks. Item ) à x 

$ and the correla- two points alor t size that includes: in 

dividuals, classes, 


hus obtaine 


tinuum of uni 


wh index of test-retest stability f 
districts; geographic areas, etc. 


ng acon 
schools,
Table 6. Average Test-Retest Correlations Over a One-Year Interval for EFI Tasks A, B,
and C, by Various Levels of Organizational Unit

                                                                     Average Over
                                        Task A    Task B    Task C   Three Tasks
Total, All 13 Districts
  No. of Elements Averaged              1         1         1        3
  Mean of z-Scores                      2.75      3.05      2.86     2.89
  Corresponding Correlation Ratios      .99       1.00      .99      .99
Individual Districts
  No. of Elements Averaged              13        13        13       39
  Mean of z-Scores                      1.66      1.69      1.73     1.69
  Corresponding Correlation Ratios      .93       .93       .94      .93
  Standard Deviation of z-Scores        .26       .33       .20      .27
Schools
  No. of Elements Averaged              52        52        52       156
  Mean of z-Scores                      1.14      1.20      1.09     1.14
  Corresponding Correlation Ratios      .81       .83       .80      .82
  Standard Deviation of z-Scores        .33       .51       .42      .43
Grade Levels within Schools
  No. of Elements Averaged              131       129       131      391
  Mean of z-Scores                      .78       .89       .73      .80
  Corresponding Correlation Ratios      .65       .71       .62      .66
  Standard Deviation of z-Scores        .39       .51       .41      .45
Classrooms
  No. of Elements Averaged              119       119       119      357
  Mean of z-Scores                      [illegible]  .84    .72      [illegible]
  Corresponding Correlation Ratios      .63       .69       .62      .65
  Standard Deviation of z-Scores        .37       .52       .39      .44
Teachers
  No. of Elements Averaged              118*      118*      118*     354
  Mean of z-Scores                      .74       [illegible]  [illegible]  .73
  Corresponding Correlation Ratios      .63       [illegible]  [illegible]  .62
  Standard Deviation of z-Scores        [illegible]

*There were 119 teachers, but each of the tasks was left incomplete by one
teacher, a different one on each task.
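
The averaging procedure behind the table (convert each correlation to a z-score, take the mean, convert back) can be sketched as follows. The sketch assumes the z-scores are Fisher's r-to-z transform, z = atanh(r); the text does not name the transform, but the tabled pairs of mean z-scores and correlation ratios are consistent with it. Function names are illustrative.

```python
import math


def r_to_z(r):
    """Fisher's r-to-z transform (assumed to be the transform used)."""
    return math.atanh(r)


def z_to_r(z):
    """Inverse transform: convert a (mean) z-score back to a correlation."""
    return math.tanh(z)


def average_correlation(correlations):
    """Average correlations by way of their z-scores rather than directly."""
    zs = [r_to_z(r) for r in correlations]
    return z_to_r(sum(zs) / len(zs))


print(round(z_to_r(1.69), 2))  # individual districts, mean z of 1.69 -> 0.93
print(round(z_to_r(0.80), 2))  # grade levels, mean z of 0.80 -> 0.66
```

Averaging in the z metric is a common choice when correlations are high, since correlations themselves are compressed near 1.0.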


Figure 3. Relationship between Size of Organizational Unit and
Stability of EFI Patterns Over One Year: Mean of Three Tasks

The findings at these two points are consistent with the
view that test-retest stability of the EFI increases with
increase in the size of the unit surveyed.
To test the generality of this effect, a special analysis was
carried out. Using data collected in two successive
years, indices of year-to-year reliability/stability for
several points along the continuum of unit size were calculated.
The data were collected in spring surveys, including
that of 1974, conducted as part of the regular monitoring
of program implementation. In both years, the survey
covered the 300 teachers and the 300 teaching
assistants in the Responsive Education Program. However,
District 4 was not included in the spring 1974 survey because
respondents there had been surveyed twice a few months
earlier for Study III. These two administrations of
the EFI yielded year-to-year data points for 13
school districts, with 52 schools, with 132 grade levels
within schools, and with 119 classrooms.


No classroom is the same from one
year to the next. For purposes of this analysis, however, a
classroom was considered the same if the school, the grade
level, and the teacher were the same. There were 119 such
classrooms for which the returns of at least the teacher
could be traced to the same individual at both time points.
In some of these, the teaching assistant in the classroom was
the same and also returned an EFI form at both time points.
Attrition was due to respondents opting to remain
anonymous, or to their decision not to participate. These
factors limited the number of elements that could be included
in the analysis from among the individual and the
classroom categories. In the case of the larger units, where
aggregation was feasible and meaningful without identification
of respondents, year-to-year comparisons could
be made without regard to whether any or all of
the respondents were the same at the two time points.

Test-retest correlations were computed for each task:
for the responses aggregated over all 13 districts taken
together, and individually for each of the 13 districts, 52
schools, 132 grade levels within schools, 119 classrooms
(including the current teaching assistant where available),
and the teachers. The data are presented in Table 6.

There is a consistent decrease in the size of the correlations
as we move from program to district, to school, to
grade level, to classroom, to teacher. The size of the differences
varies; but the direction of difference is always the
same: the smaller the unit, the lower the average correlation.
This trend is charted in Figure 3.

It is of special interest that the stability indices are lower
for teachers than for classrooms. If the EFI reflected only
personal, rather than inter-individual factors, we would
expect just the reverse. The sample of teachers was defined
so as to insure that the scores at both time points would
reflect the same individuals, whereas the scores for classrooms
represent averages of teachers and teaching assistants,
and the latter are for the most part different individuals
at the two time points. This result is another indication
that the EFI faithfully reflects the reality and the constancy
in the educational setting, rather than any purely personal
or a-situational factors impinging on the respondents.


Summary and Conclusions 


In light of these four studies, the psychometric properties
of the EFI instrument are seen to be very much in
keeping with the purpose for which it was originally designed:
systematically mapping the educational setting considered
as the work-space or the psycho-social field of
forces of the classroom teacher.

Study I, and in some sense Study II as well, demonstrated
that the degree to which the patterns of two different
raters are similar to each other is directly and consistently
related to important dimensions which define the
environmental setting considered as the work-space of the
classroom teacher: the dimensions of administrative-
geographic distance and professional role. In other words,
the EFI does differentiate the field of forces within the
teacher's work-space along lines that are significant to the
teacher.

Study II showed that force field patterns generated by
the EFI can be referred to the educational setting; that is,
these patterns are confirmed by the independent perception
of knowledgeable individuals.

Study III and Study IV showed that the EFI patterns
are highly stable for schools and districts (r's of .80-.94)
and moderately high for classrooms and grade levels (r's of
.62-.71), and that the reliability increases with increase in
the size of the unit sampled.
In education, as in any institutional or group enterprise,
individual and group change is most effective when norms
or standards regulating behavior are changed. When a norm
is changed, group members change their behavior to conform
to the new norm. On the other hand, attempts to
change group or organizational behavior by changing individual
behavior often result in resistance to change,
particularly when an individual perceives that the change
is not endorsed by his peers. Thus, the primary focus in
the development of educational programs and the improvement
of the educational setting should be normative
change, with individual change a by-product.

In and of itself, the EFI survey does not bring about
changes in the educational setting, unless we count the
changes that come about with an opportunity for harried
teachers to ventilate. Once the situation has been specified
in terms of an EFI pattern, however, remedial action regarding
factors related to norms and/or the work setting may be
obvious; at least the options will have been more clearly
defined.

At the Far West Laboratory, force field patterns are
being used as an important component of the process of
program evaluation. Unlike performance-based criteria
such as child test scores or teacher behavior indices, they
relate directly to what help is being or can be provided the
educational practitioner: teacher, teaching assistant, principal,
curriculum specialist, program developer, or other
change-agent. A report elaborating the specifics of using
the EFI for formative evaluation is in progress.


FOOTNOTES 


1. The description of the EFI given here reflects the format of
the instrument in its current form, which was in use when the data
reported here were generated. The previous publication (2:27) was
based on an earlier version that differed as follows:

a. Force #7 was Statewide Mandates on certifications,
curriculum, grading, etc.

b. Force #10 was Curriculum Personnel, such as reading
specialist, art teacher, etc.

c. Tasks were One, Two and Three, rather than A, B, and C.

d. The first task was a ranking, rather than the Task A rating
from 0 - 9.

e. The range of ratings on the third task was 1 - 5 rather than
0 - 9.


JOURNAL OF EXPERIMENTAL EDUCATION 


2. For a description of Follow Through, see (2:34). 
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THE EFFECT OF ENCODING AND AN EXTERNAL 
MEMORY DEVICE ON NOTE TAKING 
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ABSTRACT 


College students were randomly assigned to seven note-taking and review conditions in order to determine the relative
importance of the functions of encoding and either an externally provided or a personally produced memory device. Results
of the post-test showed that a combination of encoding and reviewing either one's own notes or an outline of the lecture
produced the best recall scores, while either personally encoding notes or being provided with a lecture outline during the
lecture accompanied by "mental" review produced the least recall. The findings are discussed in terms of practical suggestions
for professors and their students.


A UNIVERSAL CHARACTERISTIC of college students 
is that they report for class carrying a notebook in which to 
take notes on material presented during class lectures. 
Students put a great deal of effort and faith into the taking 
of notes presumably so they can be used later while review- 
ing for exams. Despite this widespread practice of taking 
notes, little experimental evidence exists as to the exact 
functions of note taking. Note taking appears to serve either 
or both of two functions: an encoding function in which 
the material heard in a lecture is transformed into a per- 
sonally meaningful form, and an external memory function 
which serves for later review. DiVesta and Gray (1) and 
Howe (3) found that the encoding function was the more 
important of the two functions. These authors argued that 
too much reliance on notes as an external memory device 
can result in inefficient learning if the crucial encoding 
stage is bypassed. Howe (3) suggested that if the only
function of notes were the external record of information
provided by the professor during class, it would be more
efficient to hand out mimeographed outlines of the
lecture before class so students would be free to react to
other things. Fisher and Harris (2), however, found that
of the two possible functions of taking notes, the external
memory device had the greatest facilitating effect on recall.
They found that Ss who only made use of notes as an external
memory device performed better on a test of recall
than Ss who benefited from both of the functions of note
taking.

The purpose of the present study was to investigate
the relative importance of the note-taking functions of encoding
and an external memory device, as well as various
combinations of these two functions, by manipulating the
note-taking and review conditions of the study. It also
attempted to assess whether the better external memory
device was one personally produced by the students or
externally provided by the professor.

Method 
Subjects 


Eighty-five students enrolled in four sections of a sophomore
Growth and Development course served as
Ss. One randomly selected section served as a control group
in which students received no instructions regarding
note taking and were provided no opportunity to review
prior to an examination on the content of a lecture. A
lecture on the principles of behavior modification
was presented to each of the four sections
by the experimenter, who was also the regular instructor.
Behavior modification is a topic ordinarily covered in the
course in a lecture of about this length, but it was not
discussed in any of the required readings.


Procedures


Two note-taking conditions were employed: in one condition
students were instructed to take notes, and in another
condition students were provided a copy of the lecturer's
notes. These two note-taking conditions were paired with
four different review conditions. Some students reviewed
their own personally encoded notes (RON); others reviewed
a copy of the lecturer's notes (RLN) which served as an
external memory device; others were provided their own
notes and a copy of the lecturer's notes (RON + RLN); and
still others were instructed to review the material mentally
(MR).

The note-taking conditions and the review conditions
were combined to form the following seven treatment
conditions: (1) Ss took notes and reviewed their own
notes (N-RON); (2) Ss took notes and reviewed a copy of
the lecturer's notes (N-RLN); (3) Ss took notes and
reviewed mentally (N-MR); (4) Ss took notes and reviewed
both their own notes and the lecturer's notes (N-RON +
RLN); (5) Ss were provided a copy of the lecturer's notes
and reviewed a copy of the lecturer's notes (LN-RLN);
(6) Ss were provided a copy of the lecturer's notes and
reviewed mentally (LN-MR); and (7) the control condition in
which Ss received no instructions on note taking and were
provided no review time (NIT-NRT).

Since only four sections were available, and in order to
avoid reactivity of experimental arrangements, similar
treatment conditions were assigned to the same section. Classes
were assigned to combinations of the two most similar
treatments. In class one, students were randomly
assigned to conditions N-RON or N-RLN; in class two,
to conditions N-MR or LN-MR; in class three, to
conditions LN-RLN or N-RON + RLN; class four comprised the
control group.
At the beginning of the behavior modification lecture, students
in three of the sections were provided a packet which contained
instructions and materials appropriate to their treatment
condition. Students in the fourth section were not
provided any instructions or materials. Following the
lecture all students were asked to turn in their notes so that
the instructor might assess the effectiveness of the lecture.
Two weeks following the lecture a review session and an
examination were administered. Prior to a 10-minute review
period each student received a packet compiled according
to his treatment condition. Students in the RON review
condition received their own notes for review, students
in the RLN condition received only a copy of the lecturer's
notes to review, and students in the RON + RLN condition
received both their own notes as well as a copy of the
lecturer's notes. Students in the MR condition were instructed
to review the lecture on behavior modification
"mentally" (i.e., sit and think about the material). The
control class received no time for review. All Ss were given
a 21-point examination consisting of 10 objective questions
and 5 short answer questions worth a total of 11 points.


Table 1. Analysis of Variance for Short Answer Items

Source of Variation    SS        df    MS      F
Between Groups         135.87    6     22.6    [illegible]*
Within Groups          229.35    78    2.95
Total                  365.22    84

*F.01(6/60) = 3.12


Table 2. Analysis of Variance for Objective Items

Source of Variation    SS        df    MS      F
Between Groups         31.28     6     5.21    1.83
Within Groups          221.71    78    2.84
Total                  252.99    84

F.05(6/60) = 2.25


Table 3. Analysis of Variance for Total Scores

Source of Variation    SS        df    MS      F
Between Groups         164.67    6     27.45   3.84*
Within Groups          556.98    78    7.14
Total                  721.65    84

*F.01(6/60) = 3.12


Results 
A single-factor, unweighted-means analysis of variance
was performed on the dependent variables of total test
score (number of correct responses), number of correct
responses on the short answer part of the examination,
and number of correct responses on the objective items.
A significant treatment effect was found for the total score
(F = 3.84; df = 6/78; p < .01), and for the number correct
on the short answer items (F = [illegible]; df = 6/78; p < .01),
but not for the number correct on the objective items
(F = 1.83; df = 6/78; p > .05). See Tables 1, 2, and 3.


Table 4. Means and Standard Deviations for Short Answer,
Objective, and Total Scores by Treatment Condition

Treatment             Short Answer        Objective           Total
Condition      N      Mean     SD         Mean     SD         Mean     SD
N-MR           10     2.8      1.14       7.3      1.83       10.1     2.38
LN-MR          10     3.0      1.25       7.7      1.49       10.7     1.89
NIT-NRT        24     3.33     1.55       6.38     1.44       9.71     2.27
N-RON          13     6.31     1.89       6.46     2.44       12.77    3.75
N-RLN          12     5.33     1.78       7.17     1.53       12.5     2.11
LN-RLN         9      4.78     2.49       7.22     .97        12       3.16
N-RON+RLN      7      5.29     1.89       8.29     1.8        13.57    [illegible]


The mean scores and standard deviations for the short 
answer items, objective items, and total scores within each 
of the seven treatment conditions are presented in Table 4. 
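
The F ratios in Tables 1 to 3 follow from the sums of squares and degrees of freedom by the usual one-way analysis of variance arithmetic: each mean square is SS divided by df, and F is the between-groups mean square divided by the within-groups mean square. The sketch below is only that arithmetic, with an illustrative function name; it does not reproduce the unweighted-means procedure itself.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) from sums of squares and df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares as printed in Tables 2 and 3; df = 6 and 78 in both.
_, _, f_objective = f_ratio(31.28, 6, 221.71, 78)
_, _, f_total = f_ratio(164.67, 6, 556.98, 78)
print(round(f_objective, 2))  # objective items -> 1.83
print(round(f_total, 2))      # total scores -> 3.84
```

Applied to the Table 1 sums of squares (135.87 and 229.35), the same arithmetic gives an F of roughly 7.7 for the short answer items, well above the tabled critical value of 3.12.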


Discussion 

The results of this study suggest that both note-taking 
functions are important, but that the more important 
function for success in later recall is the encoding function. 
Examination of the three top conditions for both significant
dependent variables (N-RON + RLN, N-RON, N-RLN)
indicated that the most important function for success on
an examination was the encoding of a personal set of
notes, but that it made little difference whether the external
memory device was externally provided or personally
produced. Apparently the substitution of a copy of the
lecturer's notes for the S's own notes in the review process
does not cause any interference. The fourth performing
group for both dependent variables was the LN-RLN condition,
which did not benefit from the encoding function
but was provided with an external memory aid. The three
groups with the lowest total score and the poorest recall
on the short answer items either had no external memory
aid or were in the control condition. Mental review does
not appear to be very successful regardless of the circumstances.


Several practical suggestions for both professors and 
their students arise from this study. The process of 
personally encoding the lecture through the taking of notes 
is very important for success on tests of recall. Although,
as this study suggests, the student's grade on an examination
may depend on his skill as a note taker, very little
attention has been given to training in this skill. Additional 
research is needed comparing various specific methods 
of taking notes (i. e., record main points only, record as 
many details of the lecture as possible) so that students 
can be instructed as to the most efficient and effective 
method for taking notes. An external memory device for 
review also is important, but it does not seem to matter 
whether this device is the same set of notes personally 
prepared by the student or a copy of the lecturer's notes. 
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ACHIEVEMENT MOTIVATION TRAINING 
FOR LOW-ACHIEVING EIGHTH AND 


TENTH GRADE BOYS 
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ABSTRACT 


An achievement motivation training course was given
to 24 teachers in San Mateo County, California. They then trained
eighth and tenth grade students (four weekends during the fall school
semester). Student training was conducted in two settings. About
one-half of the students were trained in a local YMCA camp. The
remaining students were trained in their local school setting.
Evaluation of the trained students' grades in mathematics, English,
and social studies over the school year indicated that trained
students performed significantly better in mathematics than a
randomly selected control group of students. Evaluation of pre- and
post-training standardized test scores in science and social studies
showed that the trained students performed significantly higher on
the science tests than did control students.
AS A RESULT of the development of the [...] Need for
Achievement (10) and subsequent research dealing with the
achievement motive, there has been developed an achievement
motivation training (AMT) course. The training course is an
effort to change the motivation tension system of the course
participants. In essence, AMT is an answer to the question,
"How does one go about motivating people?"

McClelland has called motives "affectively toned [...]"
(12; [...]) in the cognitive makeup of a person. The saliency
of a motive in the cognitive network of a person can be
measured by counting the number of associations [...] to the
motive cluster. [...] AMT is an attempt to increase the
precision and, thus, the saliency of Need for Achievement
(n Achievement) thinking. The course, furthermore, is designed
[...] participants to [...] the workings of the [...].

McClelland (11), in a summary of n Achievement [...], [...]
in certain ways: [...] they take moderate risks; their strategy
is to get and use as much concrete [...] possible; when
approaching a challenge, the person high in n Achievement
seeks out ways to take innovative action; and finally, those
high in n Achievement thinking appear to search out positions
of responsibility, namely, action areas where they can feel
that their individual efforts in either group or personal
goal-directed activity will make a difference. The term which
best captures the essence of the achievement motive construct
validity picture according to McClelland (11) is
"entrepreneurship."

It is beyond the scope of this paper to describe in detail
the AMT course. Various versions of AMT have appeared
elsewhere (3;13;16;17). Generally, the course involves a
[...] of activities carried out in a minimum of four to five
days in a live-in residential setting. The residential setting
helps to create an atmosphere of total involvement away from
the humdrum of everyday activity. The format for most of the
game activities is performance in the activity followed by a
discussion period in which each participant is encouraged to
review his thoughts, strategies, and reactions in the activity.

Perhaps two examples of AMT inputs will help the reader
visualize its nature. Course participants are first [...] the
[...]. They are then taught to [...] to make precise the
definition of the [...]. Participants are encouraged to make
jargon use of scoring categories in the discussion of course
activities. Theoretically, the principle behind teaching the
scoring system is similar to that of psychotherapy. The
more one verbalizes about an experience, the more the
experience may become meaningful.


An example of the game activity is the Ring Toss Game. 
Each participant is given an opportunity to toss a number of 
rings over a peg while standing at any distance from the 
peg that he chooses. After several turns the results of each 
participant's efforts are recorded for all to see, and each is
asked to discuss his or her strategy. The discussion readily
reveals the moderate risk-taking strategy and strategies that 
avoid the challenge of the game. Those who stand close to 
the peg, a low-risk distance, are maximizing success possibil- 
ities but incur the good natured wrath of other participants 
as expressed in the comment, “Anyone could do it from 
there." Those who stand too far away from the peg, a 
caution to the wind or high-risk distance, seem to be 
hiding behind the comment, “I tried but no one could be 
expected to succeed from that distance.” If successful, 
however, they reap the paradoxical benefit of either being 
considered very skillful or, more likely, being chided by the 
comment, “You were just lucky.” The point is that mod- 
erate distances are those which offer the most challenge, 
and it is at moderate distances where those high in n Achieve- 
ment have been found to stand (11). The Ring Toss Game 
and other similar experiences offer participants concrete 
opportunities to witness the workings of the achievement 
motive both in themselves and in others. The question, 
“Am I a person high in n Achievement or do I want to be?” 
is brought squarely into focus. 


AMT, although principally used in the training of 
businessmen (13), has found its way into the educational 
system. Burris (1) launched the AMT effort by his initial 
attempt to improve the school performance of college 
students by counseling them using the n Achievement 
scoring system as a guideline. His results indicate that his 
counselees’ grades improved. Kolb (9) found AMT effective 
in increasing the grades of bright tenth grade students from 
upper socioeconomic homes. Recently deCharms (4) has 
reported evidence of increased school performance of 
black sixth graders in St. Louis as a result of being taught 
by teachers who had experienced AMT and who structured 
some of their regular teaching units around the AMT ex- 
periences. 

The present study was designed to test whether a group 
of teachers given a revised form of AMT could, in turn, by 
training students outside of their regular school activity, 
generate a measurable improvement in their academic 
performance. Simplified, the chain of reasoning was: AMT will 
lead to increased n Achievement thinking, which will lead 
to increased achievement behavior, which will result in 
improved academic performance. The study was mainly 
designed to test the hypothesis: AMT will have the effect of 
increasing the achievement motivation of students and lead 
to improvement in their academic performance. 
Method 


First, the study design involved training for a group of 
24 teachers during the last two weeks of their summer va- 
cations. Two training sessions of four and one-half days 
each were conducted. Each session was attended by twelve 
teachers, an expert trainer and the author.? The teachers 
then trained students. Students were given AMT in two 
settings. One group of students recruited from the schools 
participating in the study was given the training in a live- 
in camp setting, a local YMCA camp in the area. Another 
group of students was given AMT in their local school setting. 
A third group was given no training. Students were 
selected from both the eighth and tenth grades. The train- 
ing of students was conducted on four weekends during 
their fall school semester. 


Subjects 


Officials from three high schools and five intermediate 
schools agreed to cooperate in the study. The schools were 
located in northern, central, and southern San Mateo 
County, California. 

The first step in the selection of students was to identify 
average ability, low-achieving students from each cooperating 
school. The cumulative records of the schools included 
intelligence scores for all students. For each school a 
regression coefficient was computed based on an average 
of the students' grades in English, social studies, and 
mathematics and their intelligence test scores. From each school 
those students with intelligence scores between 85 and 17 
who were going into the eighth and tenth grades, and who 
were not achieving up to their predicted levels as indicated 
by the overall achievement of the beginning eighth and 
tenth graders in their school were identified as low achievers. 
From the low-achieving group of each participating 
school a control group was randomly selected. Following 
the control group selection, volunteers for AMT were 
recruited from each of the participating schools. Sixty- 
four eighth grade students and 78 tenth grade students 
volunteered (average age: eighth grade—13 years 10 
months, tenth grade—15 years 9 months; average IQ: 
eighth grade—105, tenth grade—104). The average IQ 
for the tenth grade control students was 105 (n = een as 
The average IQ for the eighth grade control students was 
also 105 (n = 30). The difference between experimental 
and control students' IQ was not statistically significant. 

Students who volunteered for AMT might be considered 
more motivated to start with. However, nearly everyone 
who was asked to participate accepted. Those who refused 
gave other commitments (work, athletics, etc.) as reasons 
for their refusal. Initially, at least, differences in 
performance after training could not be due to the 
desire to participate in AMT. 


Procedure 


There were six separate experimental groups for both 
eighth and tenth grade. Three groups from each grade were 
trained in a camp setting and three groups from each grade 
were trained in their local school setting. Trained teachers 
conducted the training groups. Students were randomly 
assigned to the camp or the school setting. Students assigned 
to the camp setting were bussed to the training site the 
Saturday morning of the training weekend and returned 
Sunday evening. Students assigned to the school setting 
spent Saturday and Sunday of the training weekend at 
the school but returned home the Saturday night of the 
weekend. The only difference between the camp and school 
training format was that the camp students experienced 
some of the live-in impact of training. Obviously, the 
experimental arrangement was designed to test whether that 
live-in impact was an essential part of the training. 

Before training, experimental and control students were 
given the Stanford Achievement Test, Form W, science 
and social studies tests, Part A. In the late spring of the 
following semester, Form X of the same test battery was 
administered. Experimental and control students were 
brought together in a single testing session and told 
the tests were part of the school testing program, but that 
the scores would not be shown on their school records. The 
scores used in the analysis of data were residualized gain 
scores computed from the regression of Form X on Form W. 
Using the initial Form W scores and the regression 
coefficient, a predicted score was determined for each student. 
The predicted score was subtracted from the student's 
actual Form X score, and then 50 points were added to 
eliminate negative scores. School grades were treated in 
the same manner: June grades in English, mathematics, and 
social studies were residualized in the same way as 
the standardized test scores. 


Results 


A 3 x 3 x 2 analysis of variance was applied to evaluate 
the main effects and interactions among training, grade 
level, and school nested in grade level. Table 1 shows a 
comparison of the gain score means for experimental 
students who attended at least three sessions with the 
means for the control students. Table 2 shows the analysis 
of variance for the science scores presented in Table 1. 
Inspection of the mean scores for social studies test scores 
indicates no support for the predicted training effect. The 
science scores were in the predicted direction. Inspection 
of Table 1 shows that the trained groups scored higher on 
the science tests than did the control group, and Table 2 
shows that the differences in performance exceeded chance 
expectancies. 

Table 3 shows the June grades of those students who 
attended three training sessions. Inspection of the means 
shows that only the mathematics grades were in the 
predicted direction. Table 4 shows the residualized gain 
score for the mathematics grades. Table 5 shows the 
analysis of variance for the gain scores in mathematics. 
Inspection of Table 5 shows that the trained groups' gain 
scores exceeded chance expectancies. 
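
The residualized gain scores used in these analyses can be illustrated with a minimal Python sketch. It assumes a simple least-squares regression of the later score (Form X) on the initial score (Form W) fitted over one group of students; the function name `residualized_gains` and the sample scores are hypothetical, and the source does not state whether the regression was fitted within each school or over the whole sample.

```python
from statistics import mean


def residualized_gains(pre, post, offset=50.0):
    """Residualize post scores on pre scores and add a constant offset.

    Assumption: ordinary least-squares regression of post on pre over the
    whole list; the offset of 50 follows the procedure described in the text.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    pre_mean, post_mean = mean(pre), mean(post)
    sxx = sum((x - pre_mean) ** 2 for x in pre)
    if sxx == 0:
        raise ValueError("pre scores must not all be identical")
    sxy = sum((x - pre_mean) * (y - post_mean) for x, y in zip(pre, post))
    slope = sxy / sxx                       # regression coefficient
    intercept = post_mean - slope * pre_mean
    # actual score minus predicted score, shifted to avoid negative values
    return [y - (intercept + slope * x) + offset for x, y in zip(pre, post)]


if __name__ == "__main__":
    form_w = [40, 50, 60, 70]   # hypothetical initial scores
    form_x = [42, 55, 58, 75]   # hypothetical later scores
    for gain in residualized_gains(form_w, form_x):
        print(round(gain, 2))
```

Because least-squares residuals sum to zero, the gain scores of the whole fitted sample average exactly 50; a group mean above 50 therefore indicates that the group did better than predicted from its initial scores.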

Attendance figures indicate that 34 of the original 
136 students who volunteered and who attended the 
beginning training sessions dropped out of training before 
the third or fourth session. That drop-out rate (25%) is 
not astonishing when one takes into account that students 
were asked to volunteer their weekends for training. It is 
obvious, however, that cooperative attitude reflected in 
willingness to attend sessions was a source of error. 

The significant school nested in grade level main effect 
shown in Tables 2 and 5 will not be discussed. This was as 
expected, due to the different composition of student 
body of the schools selected for the study, and is of little 
theoretical interest. 


Discussion 


Certainly, the results indicate only marginal support for 
the hypothesized effect. To some extent, this was 
expected. The AMT course was in many of its aspects less 
than ideal. Teacher trainers were considerably less than 
expert in their training skills. As it was their first training 
effort, the “bugs” were not yet out. Because 
the training itself was done on weekends, the 
involvement so important for the self-study and 
interpersonal support facets of the training was weakened. 


Table 1.—Effect of AMT on Gain Scores for Students Who 
Attended at least Three of the Four Training Sessions 


Training Group 
Camp              49.6    5.8    50.8    6.2 


Table 2.—Analysis of Variance of Science Gain Scores for Students 
Who Attended at least Three of the Four Training Sessions 

Table 3.—June Grades for Students Who Attended at least Three 
of the Four Training Sessions 


                         English    Social Studies    Mathematics 
Training Group     N     Mean       Mean              Mean 
Campus            45     5.5        5.9               5.5 
Camp              47     5.8        5.6               6.1 
Control           69     5.8        6.2               4.9 


Table 4.—Effect of AMT on Mathematics Grades Gain Scores for 
Students Who Attended at least Three of the Four Training Sessions 


                         Mathematics Grades 
Training Group     N     Mean       SD 
Campus            45     52.3       20.3 
Camp              47     59.6       22.6 
Control           69     47.0       19.8 


Table 5.—Analysis of Variance of Mathematics Grade Gain Scores 
for Students Who Attended at least Three of the Four Training 
Sessions 


SOURCE                   df     MEAN SQUARE     F 
GRADE LEVEL (A)           1         154 
SCHOOL (NESTED) (B)       4        1207         2.76* 
TRAINING (C)              2        1690         3.87* 
AxC                       2         395 
AxBxC                     8         679         1.78 
ERROR                   143         436 


*p<.05 
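
As a worked check on Table 5, the short Python sketch below recomputes two of the F ratios. It assumes each F is the effect mean square divided by the error mean square of 436; the helper name `f_ratio` is hypothetical, and the source does not spell out which error term was used for each effect.

```python
def f_ratio(ms_effect, ms_error):
    """Return the F ratio for an effect, assuming a single error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


MS_ERROR = 436  # error mean square from Table 5

# Mean squares copied from Table 5 for the two starred effects.
EFFECTS = {
    "SCHOOL (NESTED) (B)": 1207,
    "TRAINING (C)": 1690,
}

if __name__ == "__main__":
    for source, mean_square in EFFECTS.items():
        print(f"{source}: F = {f_ratio(mean_square, MS_ERROR):.3f}")
```

Under this assumption both ratios fall within .01 of the tabled values of 2.76 and 3.87; the small differences are consistent with the mean squares having been rounded to whole numbers in the table.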


The two-step process of training, that is, teachers being 
trained and then training students, in spite of the weak- 
nesses in the arrangement, was deemed necessary to show 
the value of AMT for school use. Previous in-school use of 
AMT involved the direct contact between the expert 
trainer and students. The present study and the one 
conducted by deCharms (4) were conducted to show that 
training skills could be put in the hands of teachers after 
minimal exposure to expert training. If they, then, could 
use those skills to influence student academic performance, 
the initial foundation for curriculum innovation would be 
laid. 
The configuration of results in the present study, 
when viewed in the context of the n Achievement construct 
validity picture, does make sense. Kagan and Moss 
(8) found a positive correlation between n Achievement 
in middle childhood and skill at constructional activities. 
McClelland and Winter (13) pointed to the fact that con- 
structional activities furnish the concrete feedback concern- 
ing performance that those high in achievement motivation 
desire. For example, the difference in feedback in building 
a radio set and handing in a theme is related to the need- 
for-feedback predisposition of the person high in achieve- 
ment motivation. In the radio set case, the completion of 
the last solder signals the turning-on of the set. It either 
plays or does not. In the theme case, of course, the feed- 
back is not immediately forthcoming, and, perhaps of 
more importance, is not as completely controlled by the 
person handing in the theme as it is in the radio set con- 
struction. Both science scores and mathematics scores, it 
could be argued, are sensitive to the person's need for concrete 
feedback. The science test items are a sample of constructional 
activity-related knowledge. Mathematics as an 
exercise permits immediate feedback about performance 
for anyone conscientious enough to check his answers. 
Apparently the training, where effective, was particularly 
appealing to those students who preferred to operate in 
task areas where they could gain immediate concrete feed- 
back about their performance. 

Most researchers involved with AMT have been hard 
pressed to describe exactly what changes among trainees 
take place in training. Personal experiences with AMT have 
suggested that the training cultivates a person's feeling of 
personal effectiveness. McClelland and Winter (13) in dis- 
cussing the reactions of businessmen to training have de- 
scribed the training effect as an upsurge of certainty that 
one can control one's destiny, a feeling of personal efficacy. 
DeCharms (2), building on Heider’s (6) notions of internal 
locus of control, has singled out the Origin-Pawn dimension 
as pertinent to the discussion of the achievement motive 
and human motivation in general. The Pawn feels pushed 
around by controls from outside; the Origin feels that 
he himself is in control, that is, he is controlled from with- 
in. Rotter (15) has described facets of the Origin-Pawn 
dimension in his discussion of internal and external control 
of reinforcement. These motivational variables (or single 
variable) seem to be a facet of the instinct to master (7) 
and effectance motivation (18). In other words, there is 
apparently a theoretical affinity between n Achievement 
and a basic human desire to feel personally in the driver's 
seat of one's vehicle of destiny. Apparently AMT creates 
a series of peak experiences which arouse in trainees a 
sense that they can do more to control their destiny. 
Couched in the study of achievement strategies, that feeling, 
at least for the students involved in the present study, 
apparently manifested itself in subject matter areas which 
were sensitive to the need-for-feedback and Origin 
predisposition engendered by AMT. 


Several implications of AMT are of concern to those 
involved in this type of applied research. One is the magnitude 
of change that can be effected as a result of AMT. Upon 
examination of the changes in grades represented by the 
coded scores reported on Table 1, they are seen to be of the 
D+ and C- to C magnitude. One wonders about the 
expenditures of training energy in relation to expected 


change. In change efforts which deal with motivation: 
ABSTRACT 


An experimental group of 20 inservice elementary teachers was trained using the Utah State University Classroom Management 
Protocol Modules, and compared before and after training with a control group of 9 teachers. Although the experimental teachers 
received more favorable post-training scores on all 13 classroom management behaviors covered in the modules, the differences 
were generally small and nonsignificant. The level of work involvement and deviant behavior of pupils of the experimental group 
teachers was also compared before and after the teachers had been trained. In recitation situations, pupil work involvement increased 
and deviant behavior decreased significantly. In seat work situations, pupil work involvement increased significantly, but no 
significant changes occurred in deviant behavior. 


THE PURPOSE OF this study was to determine 
whether the Utah State University Protocol Modules that 
are designed to improve the classroom management skills 
of elementary teachers brought about significant changes 
in the teacher use of these skills and also changed the 
amount of on-task and disruptive behavior of pupils in 
their classes. 

Specific behaviors taught by the U. S. U. Classroom 
Management Modules were drawn primarily from a 
correlational study carried out by Kounin (5). In this study, 
Kounin collected videotapes in 49 elementary school 
classrooms. He identified two pupil performance criteria, work 
involvement and deviant behavior. Eight pupils were 
selected for observation in each classroom, four boys and 
four girls. Each child was scored for work involvement and 
deviant behavior every 12 seconds. Work involvement was 
scored in three categories: (1) definitely in the assigned 
work; (2) probably in the assigned work; (3) definitely out 
of the assigned work. Deviance was also coded by Kounin 
into three categories: (1) not misbehaving; (2) engaging 
in mild misbehavior; (3) engaging in serious misbehavior. 
Teacher behaviors related to classroom management were 
scored by a different group of observers in Kounin's study. 
The observational procedures differed for the different 
teaching behaviors with some tallied on 6-second intervals, 
while others were rated on 30-second intervals. The 
observational procedures used in the study reported herein 
differed from those used by Kounin and will be described 
later in this paper. Table 1 summarizes correlations 
obtained by Kounin between teacher behavior and pupil work 
involvement and deviant behavior for those variables 
that were covered in the U. S. U. Protocol Modules. 

The following hypotheses are proposed: 

1. There will be no significant difference between the 
adjusted post-treatment performance of trained and 
untrained teachers on the teacher behaviors covered in the 
U. S. U. Classroom Management Modules. 

2. For teachers trained in the modules there will be no 
significant difference in the frequency of pupil work 
involvement and deviant behavior in their classes before and 
after training. 


Procedures 


Subjects 


Ss in this research consisted of 29 inservice elementary 
school teachers employed in the Denver area. Twenty of 
these teachers were trained in a course in which the U. S. U. 
Classroom Management Protocols were employed. The 
other nine teachers constituted a comparison group that 
did not receive the classroom management training. These 
teachers were drawn for the most part from the same schools 
as the experimental group teachers. However, teachers were 
not assigned to the two treatments randomly. 


Treatments 


Teachers in the experimental group were enrolled in a college 
extension course in classroom management offered by the 
University of Colorado and taught by Dr. Jean 

Table 1.—Correlations between Teacher and Pupil Behavior as 
Reported by Kounin 


                                          Pupil Behavior 
                                 Recitation                 Seatwork 
                              Work       Freedom from    Work       Freedom from 
Teacher Behavior           Involvement   Deviancy     Involvement   Deviancy 
1. Withitness                 .615         .531          .307 
2. Transitions                .601         .489          .382         .421 
3. Group Alerting             .603         .442          .234         .290 
4. Learner Accountability     .494         .385          .002        -.035 


This course extended over a period of ten weeks and met 
each week. The course content consisted 
of the four Classroom Management Protocol 
Modules developed at Utah State University. Each module 
was concerned with a major concept adapted from 
Kounin's research. In completing each module the teacher 
learned three or four specific behaviors that could be 
used in the classroom to apply the general concept. These 
concepts and specific related behaviors are listed and 
defined in Table 2. 


Table 2.—Teacher Behaviors Emphasized in the U. S. U. Classroom 
Management Protocol Modules 


TRANSITIONS 

1. Stimulus Boundedness: The teacher is deflected from the 
main activity and reacts to some external stimulus that is 
unrelated to the ongoing activity vs. Delayed Response: The 
teacher delays responding to an unrelated stimulus until a 
natural break occurs in the classroom activity. 
2. Thrust: The teacher bursts in suddenly on the children's 
activities in such a manner as to indicate that her own intent 
or thought was the only determinant of her timing and point 
of entry vs. Timely Interjection: The teacher introduces 
information in a manner which minimizes interruption to the 
students' activity. 
3. Flip-Flop: The teacher starts a new activity without bringing 
the original activity to a close and then returns to the original 
activity vs. Smooth Transition: The teacher fully completes 
one activity before moving on to the next. 


LEARNER ACCOUNTABILITY 

1. Goal Directed Prompts: The teacher asks questions which focus 
on the student's goal by asking about his work plans or work 
progress. 
2. Work Showing: The teacher holds students accountable for 
their work by having them show work or demonstrate skills 
or knowledge. 
3. Peer Involvement: The teacher involves students in the work of 
their peers by having them respond to another student's 
recitation or work activity. 


GROUP ALERTING 

1. Questioning Technique: The teacher frames a question and 
pauses before calling on a reciter (QT+), rather than naming the 
reciter and then giving the question (QT-). 
2. Recitation Strategy: The teacher calls on reciters at random 
(RS+) rather than calling on them in a predetermined sequence 
(RS-). 
3. Alerting Cues: The teacher alerts nonperformers that they may 
be called on (AC). 


WITHITNESS 

1. Desist: The teacher demonstrates Withitness by telling students 
to stop the deviant or off-task behavior. In order to be effective, 
the desist must be directed at the student who initiated the 
deviant behavior and must be administered before the deviant 
behavior spreads or becomes more serious. It must be timely and 
on target (D+). If the desist is not timely or on target, it is a 
negative desist referred to as (D-). 
2. Suggest Alternative Behavior: When deviant behavior occurs, 
the teacher diverts the disruptive or off-task student by 
suggesting that he engage in an alternative behavior. 
3. Concurrent Praise: The teacher avoids direct confrontation with 
a student who is displaying deviant or off-task behavior by 
concurrently praising the non-deviant or on-task behavior of other 
students. 
4. Description of Desirable Behavior: The teacher describes or has 
the off-task student describe the desirable behavior which the 
student usually exhibits or should exhibit in place of the 
ongoing deviant or off-task behavior. 

Each of the U. S. U. Protocol Modules consists of a 
Student Guide, a Protocol Film, and a set of evaluation 
materials. In completing a module, the teachers being 
trained went through the following steps: 

1. Scan the Learning Sequence. This gives the learner 
a step-by-step outline of what he will do. 

2. Read the module objectives, a description of the 
concept and the three specific teacher behaviors to be used to 
apply the concept in a classroom. In some modules, 
desirable and undesirable behaviors are contrasted. 

3. Complete the Recognition Practice Lessons. These 
are transcripts made from classroom audiotapes. The 
learner must identify instances when the teacher used the 
behaviors being learned and determine which behavior 
was used. 

4. View the Protocol Film and identify instances when 
the teacher in the film used the behaviors covered in the 
module. This film also provides a model for the learner. 

5. Take a performance test, the Recognition Test, de- 
signed to measure the learner’s ability to recognize class- 
room applications of the teacher behaviors and discrim- 
inate between applications and non-applications. 

6. Plan a brief lesson designed to practice the Classroom 
Management behaviors. The teacher teaches this lesson in 
his or her own class and records it on audiotape. 

7. Replay the lesson with another teacher who is par- 
ticipating in the training, record on a tally sheet use of the 
behaviors practiced, and discuss. 

Teachers in the experimental group were aware of the 
fact that they were involved in a project aimed at eval- 
uating the U. S. U. Classroom Management Modules. The 
investigators emphasized that the study was aimed at 
evaluating the modules and determining how they could 
be improved, and not aimed at evaluating the Ss as teachers. 

The control group teachers received no training in 
classroom management during the period of the study. 
Observations of their classroom management behavior were 
made during the same time that pre- and post-observations 
were being made of the experimental group teachers. The 

control group teachers were aware that they were partic- 
ipating as control group members in a study aimed at 
evaluating a new course of study. Several days prior to the 
observation all teachers were given identical instructions, 
which included a list of the specific teacher behaviors that 


would be observed in their classroom. 


Teacher Observations 

Two observers were trained to collect data on teacher 
performance. Training consisted of studying the protocol 
modules, practicing, and meeting with Dr. Langer to 
clarify the procedure and resolve problems. The observa- 
tion of teacher performance involved tallying the fre- 
quency with which teachers exhibited 19 specific behaviors 
related to classroom management. Of these 19 behaviors, 
it was found that the three positive Transition behaviors 
were extremely difficult to observe. It is easy for 
the observer to detect a teacher thrust that occurs in the 
classroom situation. However, a situation in which the 
teacher could use a “thrust” and does not is very difficult 
to detect. Therefore, data obtained on the three positive 
Transition behaviors were not employed in the analysis. 
The negative Transition behaviors were found to occur at 
a low frequency in the classrooms in this study. Therefore, 
these three behaviors were combined to yield a total 
negative Transition score. The Withitness behaviors, except 
for “positive desists” (D+), also occurred with very low 
frequencies during the pre-training observations. Tallies of 
positive and negative “recitation strategy” also occurred at 
a very low level, probably because of an unsatisfactory 
operational definition. Since the observational frequencies 
for most teachers on “recitation strategy” were zero, no 
reliability coefficient was computed. 

In order to compute the seven reliability coefficients 
shown in Table 3, the two observers independently observed 
the same teacher during the same time span on ten occasions. 
These observations were all conducted during the pre-training 
observational period. The length of these observations 
ranged from 40 to 50 minutes, with a mean observation 
time of 47 minutes. Rank difference correlations were 
computed for each teacher behavior in order to obtain 
inter-observer reliability coefficients. It will be seen in 
Table 3 that these coefficients range from .60 for goal- 
directed prompts to .97 for positive questioning technique 
(QT+). Since the specific behaviors under Transitions and 
Withitness occurred at too low a frequency to produce 
reliable scores, it was decided not to work with subscores in 
the analysis but instead to combine the subscores under 
each of the four major categories and carry out analysis of 
these four major category scores. 
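
The rank difference (rho) correlation used for these reliability coefficients can be sketched in Python as follows. This is a minimal illustration that assumes the classical formula rho = 1 - 6Σd²/(n(n² - 1)), which is exact only when there are no tied tallies; the function names and the observer tallies are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def rank_difference_rho(obs_a, obs_b):
    """Rank difference correlation between two observers' tallies."""
    n = len(obs_a)
    if n != len(obs_b) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 tallies")
    ranks_a, ranks_b = average_ranks(obs_a), average_ranks(obs_b)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


if __name__ == "__main__":
    # Hypothetical tallies of one behavior on ten joint observation occasions.
    observer_1 = [12, 5, 9, 14, 3, 8, 11, 6, 10, 7]
    observer_2 = [11, 6, 9, 15, 2, 7, 12, 5, 10, 8]
    print(round(rank_difference_rho(observer_1, observer_2), 2))
```

With only ten joint observations, as in the study, a single disagreement in rank order shifts the coefficient noticeably, which is one reason low-frequency behaviors were pooled into category totals.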

Although the inter-observer reliability coefficients 
obtained on the pre-training teacher observations were 
satisfactory, it was decided to attempt to collect two hours of 
observation on each teacher after training rather 
than the 50 minutes of observation conducted during the 
pre-training observation period. The post-training teacher 
observations were carried out by the same two observers, 
and ranged in length from 105-120 minutes. Since the 
pre-training reliabilities had been satisfactory, joint 
observations in the same classrooms were not carried out 
for the post-training observations, and inter-observer 
reliabilities were not computed. In order to make 
the pre- and post-training frequencies comparable, all 
observational frequencies were converted to a 120-minute base. 


Table 3.—Inter-observer Reliability on the Classroom 
Management Variables* 


Variable                                                Rho 
1. Transitions (total negative, i.e. sum of 
   SB-, T-, FF-)**                                      .84 
2. Learner Accountability 
   a. Goal Directed Prompts (GDP)                       .60 
   b. Work Showing (WS)                                 .96 
   c. Peer Involvement (PI)                             .90 
3. Group Alerting 
   a. Positive Questioning Technique (QT+)              .97 
   b. Positive Recitation Strategy (RS+)** 
   c. Alerting Cues (AC) 
4. Withitness (total positive, i.e. sum of 
   D+, SAB, CP and DDB)** 

* Rho correlations based upon ten observations made before 
teachers were trained. Mean observation time, 47 minutes. 
** Frequencies of subscores very low with zero entries for several 
teachers; no correlations computed on subscores. 


Pupil Observations 


Pre- and post-training pupil observations were conducted 
only in the classrooms of the experimental group teachers. 
The task of the pupil observers was easier than the teacher 
observers'. The pupil observer stationed herself at one side 
of the room so as to be able to see the faces of all children 
in the class. The observer started at one corner of the room 
and observed a child for a period of about eight seconds. 
The observer then tallied the child into one of five pupil 
behavior categories; that is, (1) definitely involved in class 
work; (2) probably involved; (3) definitely off-task; 
(4) mildly deviant; (5) seriously deviant. The observer 
then looked at the next child for a period of eight seconds 
and continued this process until she had viewed and tallied 
the behavior of each child in the classroom. This 
observation procedure resulted in the observer's tallying 
the behavior of each child once every four to five minutes. 
The observer was also instructed to be alert for any 
instances of seriously deviant behavior and to record these 
even if she was observing a different child at the time. Since 
seriously deviant behavior usually attracts a good deal of 
attention, it was believed that the observers would 
probably pick up nearly all cases of seriously deviant behavior. 
On the other hand, the observer was instructed to record 
the other four pupil behaviors only for the specific child 
being watched during the particular eight-second interval. 

It was decided that it would be desirable to have 
approximately 50 observations of each child. Since the 
observer would obtain approximately 12 observations on 
each child per hour, a total observational period of four 
hours was selected for the pupil observations. Pupil 
observations were usually collected on four or five different 
occasions. The length of the individual observational 
periods varied considerably in order to fit into various 
situations that occurred in different classrooms during the 
school day. For example, if an observer entered a classroom, 
observed for 30 minutes and then the class left the 
room for a meeting in the auditorium, the observer would 
terminate the observation at 30 minutes. In spite of this 
variation in the length of individual observation sessions, 
however, the total observation time for each teacher was 
very close to four hours. 

Observers were trained to collect the pupil observational 
data. During the pre-training observations, two 
observers were asked to observe a total of 12 
sessions in the same classrooms. When 
an attempt was made to obtain inter-observer reliability 
from these 12 sessions, problems 
were encountered. The most serious problem was the 
inability to devise a system that would keep the observers 
in phase, i.e., observing the same child during the same 
eight-second time span. Since a child may be on task one 
minute and off task the next, comparisons of pupil 
behavior for the two observers would tend to underestimate 
the inter-observer reliability. As a result of this and other 
problems, satisfactory inter-observer reliability estimates 
on the five pupil behavior variables were unable to be 
obtained. 

The authors are currently carrying out a study 
in which they have managed to solve this problem. In 
the current study, all four pupil observers have observed 
the same classrooms during the same 30-minute periods 
on 11 different occasions. In order to keep the observers 
in phase, all observers started at a given point in the 
classroom and observed the pupils around the classroom in a 
predetermined order. The head observer held up her 
pencil while observing the first child, then put the pencil 
down to tally the child's behavior, and raised the pencil 
while observing the second child. Other observers followed 
these pencil signals in order to stay in phase with the head 
observer. Inter-observer reliability coefficients were 
computed in this study. These data should be considered 
as only a rough indication of probable inter-observer 
reliability in the current study. 

Pupil performance observations were carried out only 
in the classrooms of the experimental group teachers. 
Data were collected separately for times when the class 
was involved in recitation and times when the class was 
involved in seat work. During periods when part of the class 
was involved in recitation and part in seat work, the time 
was divided equally between the two categories. 
Although the average observation time for both pre- 
and post-observations was four hours each, the proportion 
of recitation time vs. seat work time differed from teacher 
to teacher. On the pre-training observation, the mean pupil 
observation time for recitation was 140 minutes. The mean 
observation time for seat work on the pre-training 
observation was 96 minutes. For the post-treatment 
observations, the mean for recitation was 141 minutes. 
In order to make the data comparable from classroom 
to classroom, the actual frequencies of pupil behavior that 
were recorded during recitation were multiplied by a 
factor obtained by dividing 140 minutes by the actual 
observation time. This result gives an estimate of the 
frequency of each pupil behavior tally that would have been 
obtained if the recitation time for all classrooms had been 
140 minutes. Since this was the mean recitation time, 
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Table 4.—Adjusted Final Means on Classroom Management Variables 


                                                 Exp. Adj.    Con. Adj.
Variable                                         final mean   final mean    F

 1. Stimulus Boundedness (SB-)                       .11          .27       NS
 2. Thrust (T-)                                      .04          .24       NS
 3. Flip-Flop (FF-)                                  .05          .27       NS
    Neg. Transitions Total                           .20          .78       NS
 4. Goal Directed Prompts (GDP)                     4.63         2.15       NS
 5. Work Showing (WS)                              14.27         8.18       NS
 6. Peer Involvement (PI)                           3.37         2.37       NS
    Learner Accountability Total                   22.27        12.70       .05
 7. Positive Questioning Technique (QT+)           24.93        20.66       NS
 8. Positive Recitation Strategy (RS+)              1.98         1.35       NS
 9. Alerting Cues (AC)                              2.13          .49       NS
    Group Alerting Total                           29.04        22.50       NS
10. Positive Desists (D+)                           4.73         4.25       NS
11. Suggest Alternative Behavior (SAB)              2.86         2.85       NS
12. Concurrent Praise (CP)                          2.17         1.78       NS
13. Describing Desirable Behavior (DDB)             4.23         3.15       NS
    Withitness Total                               13.99        12.03       NS
    Composite of 10 favorable behaviors            65.30        47.23       NS
Table 5.—Changes in On-task and Deviant Behavior in Classrooms
of Teachers Who Completed the U. S. U. Classroom Management Modules

                                          Pre-training        Post-training
Pupil Behavior                            Mean       %        Mean       %        t*

Recitation (140 minutes)
1. Definitely involved in classwork      541.8               686.2               1.89
2. Probably involved in classwork        127.1                52.3               3.47
   (categories 1 and 2 combined)                    80.3                88.6
3. Definitely off task                   114.0      13.6      71.3       8.5     2.24
4. Mildly deviant behavior                48.4       5.8      22.9       2.4     1.94
5. Seriously deviant behavior              1.6   [illegible]                     2.20

Seat Work (100 minutes)
1. Definitely involved in classwork      393.3               514.8               2.48
2. Probably involved in classwork         51.8                40.0               1.31
   (categories 1 and 2 combined)                    80.1                82.9
3. Definitely off task                    75.7      13.6      71.9      10.7      .26
4. Mildly deviant behavior                33.4       6.0      40.6       6.0      .59
5. Seriously deviant behavior          [illegible]                                .04

*t > 1.74 is significant at .05 level; t > 2.57 significant at .01 level using one-tailed test.

multiplying the pupil behavior scores by this factor did not change [illegible]. For seat work the pupil behavior scores in each classroom were multiplied by a factor obtained by dividing 100 minutes by the actual number of minutes of seat work observation time. Again, since the mean seat work observation time was very close to 100 minutes, the result of multiplying by this factor was to provide an estimate of each pupil behavior frequency that would have occurred if seat work observation time in all classrooms had in fact been 100 minutes.

Results 


In order to determine whether teachers trained in the U. S. U. Classroom Management Modules made significantly greater gains on the 13 specific classroom management behaviors than control group teachers, an analysis of covariance was carried out on each behavior, on each of the four module categories, and on a composite score made up of the desirable behaviors. For each analysis the teachers' pretreatment performance was used as the covariate and post-treatment performance was used as the dependent variable. The adjusted final means for the experimental and control teachers are given in Table 4. It will be noted that all of the adjusted final means favor the experimental group. However, for the most part, differences between the experimental and control groups were very small. There were no significant differences between the adjusted final means on any of the 13 specific behaviors covered in the four modules. Of the four composite module scores, the adjusted final mean score of the trained teachers was significantly higher than the untrained teachers only in Learner Accountability. The composite score made up of ten favorable teacher behaviors, omitting the three negative transition behaviors, shows a somewhat higher mean for the trained teachers. However, the difference between adjusted final means was not statistically significant.
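For readers unfamiliar with adjusted final means, the following is a minimal sketch of the standard one-covariate computation: each group's post-treatment mean is corrected by the pooled within-group regression slope times the group's deviation from the grand pretreatment mean. The function name and the scores are hypothetical; the source does not give the raw data or the software used.

```python
def adjusted_means(groups):
    """groups maps a label to (pre_scores, post_scores).

    Returns the pooled within-group slope and the covariance-adjusted
    post-treatment mean of each group.
    """
    all_pre = [x for pre, _ in groups.values() for x in pre]
    grand_pre_mean = sum(all_pre) / len(all_pre)

    sxy = sxx = 0.0
    means = {}
    for label, (pre, post) in groups.items():
        mx = sum(pre) / len(pre)
        my = sum(post) / len(post)
        means[label] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in zip(pre, post))
        sxx += sum((x - mx) ** 2 for x in pre)

    slope = sxy / sxx  # pooled within-group regression of post on pre
    adjusted = {
        label: my - slope * (mx - grand_pre_mean)
        for label, (mx, my) in means.items()
    }
    return slope, adjusted


# Hypothetical scores, constructed so that post = 2 * pre + group effect.
data = {
    "experimental": ([1, 2, 3], [7, 9, 11]),
    "control": ([2, 3, 4], [5, 7, 9]),
}
slope, adj = adjusted_means(data)
print(slope, adj)
```

In this constructed example the raw post-treatment means differ by 2 points, but after adjusting for the groups' different starting levels the difference is 4 points, which is the group effect built into the data.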

In looking over the adjusted final means for both groups, it is clear that most of the 13 behaviors occurred at low frequencies during the 120-minute observation period. Only “work showing,” “positive questioning technique,” and “withitness” occurred with moderate frequency. These frequencies suggest that two hours may not provide a valid indication of teacher use of the behaviors. It will be recalled that Kounin's data on teacher behavior were collected over an entire school day. A replication of the study reported herein is currently under way, and in this new study pre- and post-observational data are being collected for each teacher over an entire school day.

The results of the current study, however, generally failed to show significant differences between the adjusted post mean performance of teachers in the experimental and control groups. Therefore, the null hypothesis cannot be rejected and we must conclude that the U. S. U. Classroom Management Modules did not bring about significant changes in the specific behaviors taught.


Pupil Behavior 
Data on pupil behavior were collected only in the classrooms of the experimental group teachers. Pre-training observations were made in late January and early February, while post-training observations were made in late April and early May. The mean frequency of each of the five categories of pupil behavior before and after the training was checked for significance [illegible]. The results for observations under recitation and seat work conditions are given in Table 5.

In the recitation condition, it will be noted that the number of pupils definitely involved in classwork was significantly higher on the post-training observation, while the number of students probably involved in classwork was significantly lower. The lesser use by the observers of the "probably involved" category during the post-training observations could have reflected the greater [illegible]. It will be noted that 80.3% of the pupils were definitely or probably involved in classwork on the pre-training observation, as compared to 88.6% on the post-training observation. The percentage of pupils definitely off-task dropped significantly between the pre- and post-training observations. Only 8.5% of the students were observed as being definitely off-task during the post-training observation, as compared to 13.6% during the pre-training observation. The frequency of mildly deviant behavior on the post-training observation dropped to less than half of the pre-training frequency.
It should be remembered that all frequencies given in Table 5 represent about 1/30th of the frequencies that would have occurred if all pupils had been observed continuously at a rate of 6 times per minute over the 140-minute observation. If our observational data obtained by viewing pupils in sequence are representative of the data that would be obtained if they had all been viewed continuously, then it can be estimated that the mean number of mildly deviant acts that would occur in the typical classroom of 30 pupils in an hour would have been 662 before the teachers were trained and 294 after the teachers were trained. Put another way, such acts would occur about ten times per minute during recitation in the pre-training classrooms and five times per minute in the post-training classrooms. Since most teachers believe that deviant behavior occurs much more frequently near the end of the school year, our results on mildly deviant behavior might have been even more striking if the post-observations had not occurred in late April and early May. This possibility could, of course, have been checked if pupil performance data had been obtained on the control group classrooms.
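The hourly estimate can be reproduced with a short calculation. This sketch assumes, as the text does, that each tabled frequency reflects about 1/30th of what continuous observation of a 30-pupil class would yield over 140 minutes; the function name is hypothetical. Applied to the recitation means for mildly deviant behavior in Table 5 (48.4 and 22.9), it gives about 622 acts per hour before training and about 294 after.

```python
def acts_per_hour(mean_tally, pupils=30, observed_minutes=140):
    """Scale a sequential-observation tally to a whole class for one hour."""
    whole_class_total = mean_tally * pupils       # undo the 1/30th sampling
    return whole_class_total * 60 / observed_minutes


pre = acts_per_hour(48.4)    # mildly deviant behavior, pre-training
post = acts_per_hour(22.9)   # mildly deviant behavior, post-training
print(round(pre), round(post))
print(round(pre / 60, 1), round(post / 60, 1))  # acts per minute
```

The per-minute figures that follow from this calculation, roughly ten before training and five after, agree with the rates quoted in the text.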
The frequency of seriously deviant behavior during recitation also dropped significantly between the pre- and post-observations. However, this behavior occurred at a very low frequency rate in the classrooms studied. The reader will recall that observers were to record mildly deviant behavior only when it occurred during the time they were watching a given child. In other words, if the observer were watching Child A during a given eight-second interval and noticed mildly deviant behavior on the part of Child B during this time, the Child B behavior would not be recorded. In contrast, since seriously deviant behavior is usually intrusive enough to attract attention in the classroom and since we expected from Kounin's research that it would be a low-frequency behavior, the observers were instructed to record acts of seriously deviant behavior even if they were committed by a child other than the one the observer was watching during a given eight-second interval. Therefore, the frequencies of seriously deviant behavior would probably not go up substantially even if all children in the classroom were continuously observed.
Seat Work

Pupil behavior data were recorded separately for recitation and seat work situations. During the four hours of pupil observation, approximately one hundred minutes of this observation took place during seat work activities in the average classroom. It will be noted in Table 5 that the number of pupils observed who were definitely involved in seat work increased significantly between the pre- and post-training observations. The percentage of observations in the “definitely” plus “probably involved” categories rose from 80.1% during the pre-observation to 82.9% during the post-observation. Frequencies of the other four pupil behaviors did not change significantly between the pre- and post-observations, although the percentage of definitely off-task behavior was reduced from 13.6% to 10.7%. Kounin’s research found that the relationship between the teacher classroom management behaviors and pupil work involvement and deviant behavior was generally smaller for the seat work condition than during recitation.

Nine of the thirteen specific behaviors covered in the U. S. U. Classroom Management Protocol Modules are aimed at establishing a classroom environment in which off-task and disruptive behavior is less likely to occur. The other four behaviors (the Withitness behaviors) are designed to provide the teachers with a means of responding to deviant pupil behavior when it does occur. Since the nine preventative behaviors can be used much more effectively by the teacher during recitation, it is not surprising that most of the significant changes in pupil behavior occurred during the recitation situation.

Discussion 

It could be hypothesized that the significant improvements in pupil behavior obtained during recitation were due to a change in the rater’s interpretation of the pupil behavior categories. Such subtle changes in observer frame of reference can occur in observational studies, particularly when two sets of observations are separated by a period of time. The pupil observers did receive refresher training, however, in order to reduce the likelihood of this happening. Also, if the changes obtained during recitation were in fact due to changes in observer frame of reference, then changes of about the same magnitude should have occurred during the observation of pupils doing seat work. Since such changes did not occur in four of the five pupil behavior categories, it seems safe to conclude that the recitation data probably represent real changes in pupil behavior.

In this study the link that was found between the U. S. U. Classroom Management Modules and changes in pupil behavior would have been much stronger if significant changes in teacher performance had also been obtained.

The failure to obtain significant changes in teacher performance could have been due to any combination of several factors. One possibility, of course, is that the modules simply failed to change the behavior of the teachers. However, since the authors have used a similar instructional model to change specific teacher behaviors in previous studies and since these studies have usually produced large changes between pre- and post-teacher behavior, this interpretation might not be correct (1, 2, 3, 4).
Another possibility is that since most of the classroom [illegible] two hours pre and two hours post was insufficient [illegible] representative sample of the teachers' behaviors. It is also possible that [illegible] during the post-observations, [illegible] as the pre-observations, which [illegible] the observers’ being less alert to teacher behaviors during the second hour of observation.


FOOTNOTE 


1. The U. S. U. Classroom Management Protocol Modules may be purchased [illegible] Dissemination Center, Division of Educational Resources, University of South Florida, Tampa 33620.


THE STABILITY OF THREE INDICES OF RELATIVE
VARIABLE CONTRIBUTION IN
DISCRIMINANT ANALYSIS


CARL J. HUBERTY 
University of Georgia 


ABSTRACT

An empirical comparison is made of three proposed indices of relative predictor variable contribution: (1) the scaled weights of the first discriminant function; (2) the total group estimates of the correlations between each predictor variable and the first function; and (3) the within-groups estimates of the correlations between each predictor variable and the first function. It was found that given a single run of an experiment, none of the indices was sufficiently reliable in identifying the rank-order of the variables except possibly when the total sample size was very large.


AS DESCRIBED BY Cooley and Lohnes (4, Chp. 9), multiple group discriminant analysis strategy begins with a principal axis analysis. This analysis is made, not of the intercorrelation matrix, but of the matrix product $E^{-1}H$, where E and H are the (p X p) pooled within-groups and the between-groups deviation score cross-products matrices, respectively, and p is the number of predictor variables. This “factoring” may be construed as a partitioning of the discriminatory power of the set of predictor variables into uncorrelated components, called discriminant functions. The vectors obtained from the eigenanalysis of $E^{-1}H$ define a discriminant space such that when points representing the group centroids are located within it, these centroids are separated from each other to a maximum degree.

Sample estimates of the weights of the ith discriminant function are determined by the (p X 1) eigenvector $b_i$, associated with the eigenvalue $\lambda_i$, from the determinantal equation

$$|E^{-1}H - \lambda I| = 0.$$

The equation which leads to the weights is

$$(E^{-1}H - \lambda_i I)\, b_i = 0,$$

which is obtained as a result of maximizing the ratio of the mean square between-groups to the mean square within-groups, the mean squares being based on the discriminant function values. The maximum number of discriminant functions necessary to represent the group differences is the smaller of p and k - 1, where k is the number of criterion groups.

To find the dimensionality of the so-called “discriminant space,” either the eigenvalues are subjected to a significance test (10:372-373), or a subset of the non-zero eigenvalues that accounts for a large percent, say 80%, of the total discriminating power of the predictor variables may be chosen. By analyzing the k group centroids in the discriminant space, it is possible to determine the role of each of the discriminant functions retained. That is, some insight into the question, “Between what groups or sets of groups does each function discriminate?” may be gained, and it is often useful to determine which predictor variables are contributing the most and the least to such discriminations. The problem thus arises of defining suitable indices of predictor variable potency, in terms of relative variable contribution to discrimination.

One index that has been proposed by several writers, e.g., Tatsuoka (12), is based on the sample “beta” weights, that is, weights applicable to standardized predictor values. The standardized weights for the ith eigenvector are determined by multiplying each element of the original vector by the positive square root of the variance of the corresponding variable:

$$b_i^{*} = a D b_i \qquad [1]$$

where a is a scalar and D is the (p X p) diagonal matrix of the positive square roots of the principal diagonal elements of E. (The a-value in Eq. [1] indicates that the eigenvectors are only unique up to a constant of proportionality.) It is argued that these weights may be used to assess the relative contribution of the predictors in determining the ith discriminant scores.

Another approach to the problem of assessing relative 
predictor variable contribution to discrimination involves 
estimates of the correlations between each of the predictors 
and each of the discriminant functions. Two estimates of 
these correlations have been used. When the data collected 
are considered representative of a single population, these 
“structure” correlations are based on the "total group" 
predictor intercorrelation matrix. The ith (p X 1) vector 
of these correlations is given by 


$$r_i = D_f R D_t b_i \qquad [2]$$

where

$D_f$ = (p X p) scalar matrix of the reciprocal of the standard deviation of the scores on the ith discriminant function;

R = (p X p) “total” intercorrelation matrix of the p predictors;

$D_t$ = (p X p) diagonal matrix of "total" standard deviations of the p variables; and

$b_i$ = (p X 1) vector of weights for the ith discriminant function.


Correlations computed this way are precisely the Pearson 
product-moment correlation coefficients between the sam- 
ple predictor scores and the sample discriminant scores on 
the ith function (5:339). 

If the underlying model is one of k populations with identical covariance matrices, then the maximum likelihood estimate of the true ith correlation vector is given by the (p X 1) vector [(2:53) or (9:225)],

$$\hat{r}_i = D^{-1} (E b_i)(b_i' E b_i)^{-1/2} \qquad [3]$$

where D is defined as in Eq. [1] and $b_i$ is defined as in Eq. [2].

The purpose of this study was to investigate the stability, over repeated sampling, of three indices of relative predictor variable potency:

1. the scaled weights as given by Eq. [1];
2. the correlations as determined by Eq. [2]; and
3. the correlations as determined by Eq. [3].

Only the vector of scaled weights associated with the first discriminant function was considered in the present study. Reasons for this restriction are that the first function usually accounts for a large portion of the discriminatory power of the set of predictors, and that for each replication of the experiment (to be discussed in the next section), the number of "significant" functions may not be the same, although there will always be at least one. [See also Bargmann (1).]
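The three indices can be illustrated with a short NumPy sketch. It is not the author's program: the function name, the small four-variable data set, and the choice a = 1 in Eq. [1] are assumptions made for illustration. The first discriminant function is obtained here through a Cholesky factor of E, which yields the same eigenvector as the eigenanalysis of $E^{-1}H$ while keeping the computation symmetric.

```python
import numpy as np


def first_function_indices(X, groups):
    """First eigenvalue of E^{-1}H, its weight vector b, and three indices."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    p = X.shape[1]
    grand_mean = X.mean(axis=0)

    E = np.zeros((p, p))  # pooled within-groups cross-products
    H = np.zeros((p, p))  # between-groups cross-products
    for g in np.unique(groups):
        Xg = X[groups == g]
        dev = Xg - Xg.mean(axis=0)
        E += dev.T @ dev
        d = Xg.mean(axis=0) - grand_mean
        H += len(Xg) * np.outer(d, d)

    # Solve H b = lambda E b via E = L L' (symmetric eigenproblem).
    L = np.linalg.cholesky(E)
    L_inv = np.linalg.inv(L)
    eigvals, eigvecs = np.linalg.eigh(L_inv @ H @ L_inv.T)
    lam = eigvals[-1]                 # eigh sorts ascending
    b = L_inv.T @ eigvecs[:, -1]

    d_within = np.sqrt(np.diag(E))
    index1 = d_within * b             # Eq. [1] with a = 1

    S = np.cov(X, rowvar=False)       # total-group covariance matrix
    sd_total = np.sqrt(np.diag(S))
    R = S / np.outer(sd_total, sd_total)
    sd_scores = (X @ b).std(ddof=1)
    index2 = (R @ (sd_total * b)) / sd_scores          # Eq. [2]

    index3 = (E @ b) / d_within / np.sqrt(b @ E @ b)   # Eq. [3]
    return lam, b, index1, index2, index3, E, H


# Hypothetical data: 3 groups of 30 cases on 4 predictors.
rng = np.random.default_rng(0)
shifts = np.array([[0.0, 0.0, 0.0, 0.0],
                   [1.0, 0.5, 0.0, 0.0],
                   [2.0, 0.0, 0.5, 0.0]])
X = np.vstack([rng.standard_normal((30, 4)) + shifts[g] for g in range(3)])
groups = np.repeat(np.arange(3), 30)

lam, b, index1, index2, index3, E, H = first_function_indices(X, groups)


def potency_ranks(index):
    """Rank 1 goes to the variable with the largest absolute index value."""
    return np.argsort(np.argsort(-np.abs(index))) + 1


for name, idx in [("Index 1", index1), ("Index 2", index2), ("Index 3", index3)]:
    print(name, potency_ranks(idx))
```

Because eigenvectors are unique only up to a constant, the sign and scale of `b` are arbitrary; the rank-ordering by absolute value is unaffected by that choice.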


Simulation Procedure 


To effect the simulation of drawing random samples of size N from k p-variate normal populations with a known common covariance matrix, a high-speed electronic computer (IBM System 360, Model 65) was used. In this study the number of predictor variables considered was p = 10, and the numbers of criterion groups were k = 3 and k = 5.

Standard normal scores were considered in the simulated sampling. In determining the common population covariance matrix $\Sigma$, the goal was to get covariances (i.e., correlations) that are typical of those found in applications. [Cochran (3) implies that in practice most correlations are positive and modest in size.] The classical factor analysis model (6:15) was considered in arriving at $\Sigma$:

$$\Sigma = A_{pop} A_{pop}' + D_{pop}^{2}$$

where

$A_{pop}$ = (10 X m) matrix of coefficients of m common factors (i.e., matrix of factor loadings), and

$D_{pop}$ = (10 X 10) diagonal matrix of coefficients of the unique factors.

The communality of each of the predictors was arbitrarily set at .75, thus making the reliability of each predictor at least .75. This condition yields a $D_{pop}$ matrix with all diagonal elements equal to .50.

Separation between the k populations was accomplished by prescribing a (10 X k) population weight matrix $W_{pop}$, and then obtaining the population mean matrix [see Rao (11:488)]:

$$M_{pop} = \Sigma W_{pop} \qquad [4]$$

(The use of the different Fisher and Rao discriminant analysis models is recognized; the relationship in Eq. [4] was merely used to get the desired separation. Even though the weights in the two models are not directly related, except in the two-group case, use of population counterparts of the Rao weights was considered appropriate in this study.)

Total sample sizes considered, across all k groups, were N = 90, 150, 300, and 450. Equal-group sample sizes $N_g = N/k$ were used. Corresponding to each N, sample score matrices of size (10 X $N_g$) were generated from each of the k p-variate normal populations having the common covariance matrix $\Sigma$. To generate these sample score matrices, a procedure similar to that suggested by Kaiser and Dickman (7) was employed. A number was selected from a uniform (0, 1) distribution using a subroutine called RANDU, corresponding to which a number from a normal (0, 1) “continuous” distribution was located. This technique was used to produce the elements of both an (m X $N_g$) matrix F and a (10 X $N_g$) matrix U. The subsample score matrix corresponding to group g was then obtained using

$$X_g = A_{pop} F + D_{pop} U + M_g$$

where

$X_g$ = the (10 X $N_g$) matrix of "observed" scores, and

$M_g$ = the (10 X $N_g$) matrix, the ith row of which contains the (constant) value of element (i, g) in $M_{pop}$.

Thus, in essence, random samples were selected on the orthogonal F and U matrices, and the observed scores were obtained by the above transformation. This sampling experiment was repeated 100 times for each N-value to provide data for empirically checking both the reliability (in the sense of consistency) and the validity of the three indices under consideration.
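A single replication of this sampling scheme might look as follows in NumPy. The number of common factors m, the loadings, and the nonzero weights are hypothetical, since the source does not report them; only the communality of .75, the unique-factor coefficients of .50, and the zero weights for Variables 9 and 10 follow the description in the text. NumPy's generator is used here in place of RANDU.

```python
import numpy as np

rng = np.random.default_rng(1)
p, m, k, N = 10, 2, 3, 90       # m = 2 common factors is an assumption
n_g = N // k                    # equal group sizes N_g = N / k

# Hypothetical loadings chosen so every row has communality a^2 + c^2 = .75.
a = np.linspace(0.2, 0.8, p)
A_pop = np.column_stack([a, np.sqrt(0.75 - a ** 2)])
D_pop = 0.5 * np.eye(p)         # unique-factor coefficients, all .50
Sigma = A_pop @ A_pop.T + D_pop @ D_pop

# Hypothetical population weights; Variables 9 and 10 are fixed at zero.
W_pop = np.zeros((p, k))
W_pop[:8] = rng.uniform(-0.5, 0.5, size=(8, k))
M_pop = Sigma @ W_pop           # Eq. [4]


def sample_group(g):
    """Draw the (10 x N_g) score matrix X_g = A F + D U + M_g for group g."""
    F = rng.standard_normal((m, n_g))
    U = rng.standard_normal((p, n_g))
    M_g = np.repeat(M_pop[:, [g]], n_g, axis=1)
    return A_pop @ F + D_pop @ U + M_g


samples = [sample_group(g) for g in range(k)]
print([x.shape for x in samples])
```

Because F and U are independent standard normal matrices, the scores in each group have covariance $A_{pop}A_{pop}' + D_{pop}^2 = \Sigma$ around the group's mean vector, which is the property the simulation relies on.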


Data Analysis 

For each replication of the experiment the 10 predictor variables were rank-ordered according to the absolute value of each index. The criterion used to judge the stability of the three indices of relative predictor variable potency was the consistency of the observed rank of each variable over repeated replications of the experiment. A necessary but not sufficient essential for a valid index of variable potency is that it exhibits consistency over repeated sampling. That is, an index lacking such consistency provides no useful inferential statements [illegible]. Only the ranks of the variables on the first discriminant function were determined.

These potency rankings were analyzed in two ways. First, the number of times each variable attained a given rank was determined. These number-of-times-per-rank counts were found for each of the four values of N and for each of the two k-values studied. These counts were organized into 24 (3 indices X 4 values of N X 2 values of k) two-way contingency tables, the rows corresponding to the ten possible ranks and the columns corresponding to the ten variables.

The second analysis involved a correlational approach. For each index the ranks of the variables, with respect to the first discriminant function, were determined for each replication of the experiment. Ranks from 1 to 10 were assigned according to the numerical value of the index. Thus, for each index, a 100-by-10 two-way table was formed for each value of N and each value of k. The relationship among the 100 rankings was determined by computing the coefficient of concordance W (8:95). This coefficient was computed for the first discriminant function, for each of the three indices, and for each of the four values of N in both a three- and five-group situation. This resulted in the computation of 24 coefficients in all. The significance of each observed value of W, i.e., the hypothesis that there was no consistency in the rankings over the 100 replications, was tested using a chi-square statistic (8:98). When an observed value of W was found to be significant, i.e., when there was evidence of some agreement of the potency rank-orderings of the discriminatory variables over repeated sampling, an estimate of the true ranking was obtained by ranking the variables according to the sums of the ranks allotted over the 100 replications. Kendall (8:114) has shown that this procedure gives a “best” estimate in a least-squares sense.


Results
Three Group Case (k = 3)


It was clear from tables exhibiting the number of times each variable attained a given rank for each index that the stability of the indices over repeated sampling is not very marked. If an index is operating consistently over repeated sampling, then each column of such a table would contain only one value which is large in relation to the others; such a pattern was not observed.

The W-values and the observed chi-square values corresponding to them are given in Table 1. All of the values were significantly different from zero (at the .01 level).


Table 1.—Coefficients of Concordance, W, and Associated Chi-Square Values for k = 3

Indices            W          χ²

Index 1
  N = 90       [illegible]  112.050
  N = 150        .177       159.007
  N = 300        .302       272.007
  N = 450        .381       343.043
Index 2
  N = 90         .182       163.887
  N = 150        .259       233.267
  N = 300        .288       258.923
  N = 450        .418       376.139
Index 3
  N = 90         .189       169.979
  N = 150        .265       238.097
  N = 300        .299       268.898
  N = 450        .425       382.894

χ² = 21.666 at 1% level.


Table 2.—Estimates of the True Rankings of the Predictor
Variables for k = 3


                               Variable
Indices       1     2     3     4     5     6     7     8     9    10

Index 1
  N = 90      8     9     2     4     6     5     3     1    10     7
  N = 150     6     7     2     5     8     4     3     1    10     9
  N = 300     8     7     3     4     5     5     2     1    10     9
  N = 450     [illegible]
  Final      7.5   7.5    2     4     6     5     3     1    10     9

Index 2
  N = 90     10     2     7     3     8     6     4     1     9     5
  N = 150    10     2     4     7     8     6     3     1     9     5
  N = 300    10     2     5     4     8     7     3     1     9     6
  N = 450    10     2     5     7     6     8     3     1     9     4
  Final      10     2    5.5   5.5    8     7     3     1     9     4

Index 3
  N = 90     10     2     7     3     8     6     4     1     9     5
  N = 150    10     2     4     7     8     6     3     1     9     5
  N = 300    10     2     5     4     8     6     3     1     9     7
  N = 450    10     2     5     7     6     8     3     1     9     4
  Final      10     2     5     5     8     7     3     1     9     5
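The coefficient of concordance, its chi-square test statistic, and the rank-sum estimate of the true ranking can be sketched in plain Python. The function names are illustrative, and the formula for W assumes untied rankings within each replication (no tie correction). The example uses the four Index 2 rankings for k = 3, with the Variable 8 entries taken as rank 1 throughout (an inference, since two of those entries are unclear in the source); it reproduces both the “final” ranks and the W of .92 reported for that index.

```python
def kendalls_w(rankings):
    """Kendall's W for m rankings of n objects (no tie correction).

    Returns W, the chi-square statistic m(n - 1)W on n - 1 degrees of
    freedom, and the rank sums of the n objects.
    """
    m, n = len(rankings), len(rankings[0])
    rank_sums = [sum(r[j] for r in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    return w, m * (n - 1) * w, rank_sums


def rank_with_ties(values):
    """Rank values in ascending order, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


# Index 2 rankings for N = 90, 150, 300, 450 (k = 3).
index2 = [
    [10, 2, 7, 3, 8, 6, 4, 1, 9, 5],
    [10, 2, 4, 7, 8, 6, 3, 1, 9, 5],
    [10, 2, 5, 4, 8, 7, 3, 1, 9, 6],
    [10, 2, 5, 7, 6, 8, 3, 1, 9, 4],
]
w, chi_square, sums = kendalls_w(index2)
final_ranks = rank_with_ties(sums)
print(round(w, 2), final_ranks)
```

The same chi-square relation links the two columns of Table 1: with m = 100 replications and n = 10 variables, W = .302 gives 100 × 9 × .302 = 271.8, close to the tabled 272.007 (the small difference reflects rounding of W), and the 1% critical value of 21.666 is that of chi-square with 9 degrees of freedom.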


Some (when N = 90) of the observed W-values which, according to Kendall, may be interpreted as correlation (here in the sense of reliability) coefficients, are quite low. That these low values were significant is simply a result of the power of the test to detect differences between the hypothetical zero-value of the population W-values and their observed values, which differences are of no practical consequence. It is clear that unless sample size is very large, neither the scaled weights nor either of the correlation estimates are very consistent over repeated sampling.

The population weights for Variables 9 and 10 were fixed at zero. That is, these two variables would be expected to exhibit minimum potency insofar as their contribution to discrimination among groups is concerned. Hence, it was possible to effect, to some extent, an evaluation of the validity of the three indices under consideration.

While none of the indices provides a very reliable rank-order of variable potency for a single run of the experiment, the reliability of each index is nevertheless sufficient to provide a reliable (in the sense of consistent) estimate of variable potency when the ranks are averaged over 100 replications of the experiment. Table 2 gives the potency of each variable based on the average value of its rank as assigned by each index over 100 runs of the experiment. With one exception, which occurred in the case of the smallest sample size (N = 90), Index 1 assigned potency ranks of least and next-to-least to Variables 9 and 10. Index 2, on the other hand, assigned potency ranks ranging from 4 to 6 to Variable 10, and the ranks assigned to this variable by Index 3 ranged from 4 to 7. Judged in the light of this criterion, Index 1 is clearly the most valid of the three.

As a check on the reliability of the average potency 
ranks over 100 replications of the experiment, Kendall's 
W was calculated for each index using the (average) ranks 
for the four sample sizes as given in Table 2. The W-values 
for the three indices were .95, .92, and .91, respectively. 
When the sum of the (average) ranks over the four sample 
sizes is used as a basis for assigning an overall potency rank 
to each variable, these “final” ranks are as shown in Table 
2. On the basis of these final ranks, Index 1 again identified 
Variables 9 and 10 as least potent. However, one of these 
variables, 10, was assigned a rank of 4 by Index 2 and a 
rank of 5 by Index 3. 


Five-Group Case (k = 5) 


The results obtained in this case very closely parallel those obtained when the number of criterion groups was three. Values of Kendall’s W were computed, as well as the chi-square values used in testing the significance of each. The results are reported in Table 3. The average potency ranks of each variable as assigned by each index over 100 runs of the experiment are given in Table 4. Again, Index 1 assigned the lowest ranks to Variables 9 and 10. Index 2 and Index 3 performed somewhat better than in the three-group case, but they still failed to consistently identify Variable 9 as one of the two variables of lowest potency. The value of Kendall's W for the (average) ranks as assigned by Index 1 over the four sample sizes was .85. The corresponding values for Index 2 and Index 3 were .95 and .97, respectively. Overall or “final” ranks were established, and are also given in Table 4. Again, Index 1 identified Variables 9 and 10 as least potent. Index 2 and Index 3 identified Variable 10 as least potent, but assigned final potency ranks of 7.5 and 8, respectively, to Variable 9.


Table 3.—Coefficients of Concordance, W, and Associated Chi-Square Values for k = 5

Indices            W          χ²

Index 1
  N = 90         .069        62.433
  N = 150        .143       128.998
  N = 300        .236       212.140
  N = 450        .351       315.506
Index 2
  N = 90         .099        88.798
  N = 150        .121       108.960
  N = 300        .191       172.189
  N = 450        .318       286.259
Index 3
  N = 90         .112       100.954
  N = 150        .129       116.149
  N = 300        .208       186.446
  N = 450        .332       298.848

χ² = 21.666 at 1% level.

It might have been well to relate the results obtained 
to the population (or true) character of the variables with 
regard to relative contribution. (It may be noted that in 
the true sense there is no agreed upon index of variable 
contribution.) No consistent relationships between the 
population weights (in $W_{pop}$) and the rank-orderings
reported in Tables 2 and 4 resulted. In the three-group 
case, the best variable (Variable 8) had a large population 
weight for one group and two zero weights. The worst 
variable, as determined by function-variable correlations, 
was Variable 1 which had one large population weight and 
two moderate weights. Whereas, in the five-group case, 
the best variable (Variable 2), according to the correlation 
index, had all non-zero population weights. A low ranking 
variable (Variable 5) had two zero weights, two moderate 
weights, and one substantial weight. 


Table 4.—Estimates of the True Rankings of the Predictor Variables
for k = 5


[Table body not recoverable. Rows: potency ranks assigned by
Index 1, Index 2, and Index 3 at N = 90, 150, 300, and 450, with
a Final rank for each index. Columns: Variables 1 through 10.]


Discussion and Conclusion 


As in multiple regression analysis, the notion of variable 
contribution in discriminant analysis is an evasive one. In 
both analyses the variables act in concert and cannot log- 
ically be separated to determine how much each variable 
contributes to prediction or to a discriminant function. 
Thus, an index of absolute contribution is, considering 
the present state of knowledge, out of the question. An 
index of relative variable contribution is, however, an 
approach advanced by many researchers. Traditionally, 
the index used to gauge the contribution of each variable 
in the company of all others is the standardized discrim- 
inant weights of Eq. [1]. Some writers have proposed 
the correlations of Eq. [2] (4) and the correlations of 
Eq. [3] (2) for that purpose. Other writers, e.g., Tatsuoka 
(12), state that such correlations are not intended as
measures of potency of discrimination, but as aids in
"interpreting" resulting discriminant functions.

Both types of indices have been advanced for use as 
descriptive indicators of relative variable contribution. 
Some researchers have implied in discussing discriminant 
analyses, however, that their results can be expected to be 
found with other samples of subjects. That is, it is implied 
that the index of relative potency will rank-order the 
variables similarly across repeated sampling; further, it is 
assumed that if a variable is judged, on the basis of the 
descriptive index, to be a nondiscriminator, then a similar 
judgment would be made if data are collected on new sub- 
jects. 

This study dealt with the reliability and, to some extent, 
the validity of the three proposed indices of relative variable 
contribution in discriminant analysis. Conclusions are 
limited to a situation in which (1) the k populations are 
10-variate normal; (2) the k population covariance matrices 
are identical; (3) the number of "subjects" drawn from each 
population is the same; (4) only the first discriminant 
function is evaluated; and (5) elements of the common pop- 
ulation covariance matrix and the differences between mean 
vectors are similar in magnitude to those used. In this sit- 
uation the findings support the following conclusions: 

1. Indices 2 and 3 can be expected to have very
comparable reliability in assessing the relative potency of
predictor variables; when the number of criterion groups is
three, this reliability is slightly higher than that of Index 1,
and vice versa for five criterion groups.

2. Index 1 can be expected to be the most valid in 
identifying those variables that contribute minimally to the 
discrimination involved. 

3. Given a single run of the experiment, none of the
indices can be expected to be sufficiently reliable to be of
great practical value in identifying potent variables unless
the total sample size is very large. The lack of reliability
of the discriminant function-variable correlations found in
this study contradicts a conclusion reached by Thorndike
and Weiss (13) in an investigation that involved two sets of
real data.
FOOTNOTE


1. The $A_{pop}$, $W_{pop}$, and $M_{pop}$ matrices are available
from the author, Dr. Carl J. Huberty, College of Education, The
University of Georgia, Athens, Georgia 30602.
EVALUATION OF TEACHER EFFECTIVENESS¹
ONE OF THE MOST important but controversial subjects in
academia today is evaluation of faculty teaching
effectiveness. Although instructor self-improvement is or
should be the primary purpose of these evaluations,
surpluses of faculty candidates and tight budgets have
pressured administrators to seek some quantitative (seemingly
objective) index by which faculty teaching effectiveness
can be measured for decisions on individual retention,
salary increases, promotion, and tenure. Probably, the
most widespread method, since its introduction in the
early 1900s, has been faculty rating questionnaires
completed by students. But, just how valid are rating
questionnaires? The objectives of this paper are to briefly
critique faculty rating questionnaires, review some of the
latest findings on their use, and present the results of an
empirical study testing the theory of cognitive consistency
as a predictor of student evaluations of college teachers.

The Problem
Faculty Rating Questionnaires 


Many different types of rating forms have been util- 
ized by colleges and universities, but frequently the ratings 
have provided little help to the instructor in improving his 
teaching skills. Questionnaires sometimes simply ask stu- 
dents to rate the instructor on some dimension (suppos- 
edly related to effective teaching), such as "accessibility 
for individual conferences," using a scale ranging from ex- 
cellent to poor. But, compared to what? Many question- 
naires ignore this critical scaling question, while others 
ask for comparisons with the "best," "average," or 
"worst" teachers. Such shifting or sliding reference points 
(which depend upon the student's personal sample of
teachers) make comparison of student evaluations invalid. 
Use of the amorphous, unattainable concept of the ideal 
teacher as a reference point is probably little better since 
it too will vary widely among students. A few rating forms 
use a forced-choice format with specific descriptive 
phrases which serve as brief critiques of instructor
performance and help reduce rater bias, especially the halo
effect where the student rates the instructor alike on dif- 
ferent teaching dimensions because of his overall attitude 
toward the instructor. 

Oftentimes, it is not clear to the student raters just
what is being measured (instructor characteristics, course
content or material, methodology, changes in student
knowledge, or personal objectives achieved). In addition,
there is no reliable evidence that those qualities listed on
a scale are the ones that advance or impede student
attainment of specific educational objectives (5). Seldom
is the amount of student learning taking place related to
instructor effectiveness. To date, "teacher effectiveness"
has not been adequately defined, and many [...]
persist among [...] and administrators about [...]
effective teacher.

The research literature on teaching evaluation
reveals few areas of solid ground. Descriptions of teaching
effectiveness tend to vary according to who is doing the
evaluation (4). For example, Wedeen (17) found that two
sections of the same course taught concurrently by the
same instructor with identical assignments and
examinations can be perceived differently by different groups
of students. In a recent study (1), it was found that student
evaluations of instructors may even decline with increasing
age differentials between students and instructors.
In general, many rating forms have been shown to be
reliable (i.e., repeatable), but the question of validity
(i.e., what is being measured) remains unresolved (5).


Static vs. Dynamic Measurement 

One problem with many of the studies on instructor
evaluation has been the static as opposed to dynamic nature
of their approaches. Albeit student expectations have been
recognized as an important influence on the evaluation
process (3, 8), the impact of "changing" student
expectations has been largely ignored. Even the vital question
regarding the effect of expected grades on student evaluations
of instructor effectiveness has been dealt with largely from
a static framework. Voeks and French (15) investigated
the proposition of whether or not students are influenced
by grades when they rate quality of teaching. Their finding
was that grades and student ratings of instructors had no
reliable relationship. More recently, Krull and Crooch
(10:9) report that “most studies indicate that the
relationship between the grade expected in a course and a student's
evaluation of the instructor's teaching effectiveness show
little positive correlation."

Indeed, it may well be that there is no significant re- 
lationship between the final grade a student receives, or 
expects to receive, and his evaluations of the instructor and 
course. That is, two individuals receiving or expecting 
final grades of C may rate the instructor quite differently; 
so might two students expecting final grades of A. Sim- 
ilarly, a student expecting a final grade of B may rate the 
instructor identically with someone expecting a final grade 
of D. The unanswered question in previous studies is: 

What is the effect of disconfirmed (or changed) grade ex- 
pectations on student evaluations of instructors and courses? 
To illustrate, will there be a difference in the ratings by 
students who expected an A at the beginning of the course, 
but whose expectations have lowered to C near the end of 
the course? Conversely, what is the effect on ratings by a 
student who originally expected a C but whose expectations 
at the end of the course rose to A? It is this dynamic view 
of grade expectations which may be more valuable in 
accounting for student differences in perceptions of teacher 
effectiveness than the typical static approach. 


Expectations 

Expectations may be described as subjective notions of 
things to come (9). In terms of student relationships with 
instructors, an expectancy may be thought of as an initial 
hypothesis formed by the student, and his perception of 
the outcome after completing the course of instruction 
will serve to either confirm or reject the original hypothesis 
(6). Grade expectations are confirmed when the student 
receives the grade he originally expected. Negative discon- 
firmation results when the grade outcome is lower than 
prior student expectations. Positive disconfirmation occurs 
when the grade actually exceeds earlier expectations. 

When expectations are realized, student evaluations of 
the instructor and course should coincide with prior ex- 
pectations and ratings, but what are the effects on eval- 
uations when the student’s expectations for a grade have 
been disconfirmed, either negatively or positively? 
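
The three outcomes defined above (confirmation, negative
disconfirmation, positive disconfirmation) amount to a simple
classification rule on the sign of the change in grade. A
minimal Python sketch follows. The letter-to-point mapping and
the function name are illustrative assumptions and are not part
of the study's procedure.

```python
# Hypothetical grade-point mapping, used only to compare letter grades.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

def classify_expectation(expected, outcome):
    """Label a grade outcome relative to the earlier expectation."""
    change = GRADE_POINTS[outcome] - GRADE_POINTS[expected]
    if change > 0:
        return "positive disconfirmation"
    if change < 0:
        return "negative disconfirmation"
    return "confirmation"

# A student who expected an A and ends with a C, and the reverse case.
print(classify_expectation("A", "C"))
print(classify_expectation("C", "A"))
print(classify_expectation("B", "B"))
```

The same rule can be applied to a pair of expected grades (start
of course versus end of course) in place of an expected and an
actual grade.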


Disconfirmed Expectancies and the Theory of Cognitive 
Consistency 
In determining the impact of disconfirmed expectations, 
the psychological theory of cognitive consistency deserves 
major consideration. Stated briefly, cognition refers to the
constellation of knowledges, beliefs, attitudes, and
perceptions that an individual has concerning himself, his
behavior, and his milieus (13). From the individual's
cognitions, a conceptual framework is developed which guides
his behavior (11). At the heart of all cognition-related
theories of human behavior is the premise that individuals 
seek consistency among cognitions. That is, they strive for 
compatibility among their knowledges, beliefs, values, and 
perceptions of themselves, and elements or persons in their 
physical or psychological frame of reference. Inconsistent 
cognitions create psychological stress or tension which, for 
relief, compels behavior directed toward the attainment 
of consistency. Osgood and Tannenbaum's congruity prin- 
ciple (12) explains this process more precisely. It implies 
that when two or more objects (a communication source 
and a goal object) are associated or linked together by an 
assertion, there is a tendency for the evaluation of one or 
both objects to change so that the two evaluations become 
more alike. The principle further stresses that changes in 
evaluation will always occur in the direction of increased 
congruence within the individual's frame of reference. 
Students continuously receive various kinds of feedback 
from their own experiences, peers, instructors, and class- 
room performances. These information inputs are cogni- 
tions which students like to keep consistent with one 
another. When a student receives two pieces of information 
which are psychologically dissonant, he attempts to reduce 
this mental discomfort by changing or distorting one or 
both of the cognitions to make them more consonant or 
compatible. The more powerful the cognitive dissonance, 
the more inclined he is to attempt to reduce dissonance by 
changing the cognitive elements (2). Accepting the premise
of cognitive consistency, pre-course expectations of students
regarding their individual performance and the quality of
instruction would tend to coincide since the students are
free to adjust either expectation to achieve consonance.
However, any discrepancy between student grade expectations
and ratings at the beginning of the course compared
to the grade expectations at the end of the course will
likely be resolved by the student’s adjusting his perceptions
of the course and the instructor so that evaluations become
more consistent (less dissonant) with his final expectations.
To test the applicability of cognitive consistency theory
to the teacher evaluation process, the following hypotheses
were considered when student grade expectations are
disconfirmed:

1. Null Hypothesis: Instructor and course evaluations 
by students before and after disconfirmed grade expecta- 
tions are not significantly different. 

2. Research Hypothesis: Instructor and course
evaluations by students tend to vary directly, after disconfirmed
grade expectations, with the directional change (up or
down) in student grade expectations.


Method 


To test the hypotheses, 140 undergraduate and graduate
business students were asked to record their expectations
on an instructor and course evaluation questionnaire at the
start of five separate courses in marketing and finance. Then,
during the last week of classes, the students recorded their
final expected grades and instructor/course evaluations on
the same questionnaires, illustrated in Table 1. Complete
student anonymity was assured by allowing students to
select their own codes for the two questionnaires for later
matching of the initial and final ratings. For several reasons
(dropouts, unmatched questionnaire codes, or absenteeism
on the day of evaluations), the final study was reduced to
108 students.

The students were divided into three groups: (1) those
whose grade expectations remained the same in both
surveys; (2) those whose expectations fell from the first to
the second survey; and (3) those whose grade expectations
rose. This taxonomy placed 61 students in the “no-change”
group, 33 in the “downward-change” category, and 14 in
the “upward-change” group. Mean responses by students
in each of the three grade expectations categories were
compared on the sixteen questionnaire variables by
univariate F-tests to determine significant differences.² For
decision purposes in this study, a significance level of .10
or higher was chosen.


Results 


To determine what effect, if any, disconfirmed grade
expectations may have on student ratings of instructors 
and courses, the first and second evaluations by students 
in each separate group were compared. 


No-Change Category 


As shown in Table 2, there was a significant difference
(.05 level or beyond) in ratings between the first and second
evaluations on eight (Nos. 1, 2, 5, 7, 10, 11, 12, 16) of the
sixteen variables concerning student perceptions of the
instructor and course. Four of the rating variables rose and
four fell from the first to the second evaluations. The t-test
comparing the composite mean scores of the first and the
second evaluations yielded a value of .7135, which is not
significant. Thus, it appears that there is no systematic
change pattern in student perceptions when grade
expectations remain constant during a course of instruction.


Downward-Change Category 


Nine rating variables (Nos. 1, 2, 4, 7, 8, 9, 12, 13, 14), as
can be seen in Table 3, were significantly different between
the first and second surveys in the group whose grade
expectations changed downward during the term. All nine
of these variables fell, i.e., they moved in the direction of
the change in student grade expectations. It is also
interesting to note that 15 of the 16 variables changed
downward in the direction of the negatively disconfirmed grade

Table 1.—Instructor and Course Evaluation Forms


VARIABLES

THE INSTRUCTOR:

 1) Was well prepared for class
 2) Stimulated intellectual curiosity
 3) Encouraged independent thinking
 4) Was concerned that the class understood him
 5) Showed respect for the questions and opinions of students
 6) Was accessible for individual conferences
 7) Did the tests cover what you could reasonably be expected
    to know?
 8) Were the lectures relevant to the course?
 9) Were required assignments appropriate to the course?
10) Was discussion used or allowed when appropriate to the
    course?
11) Did the instructor have a pleasant personality?
12) Rate this course as a beneficial educational experience
13) Rate the quality of the instructor's presentation
14) Rate the overall teaching ability of this instructor
15) How was the instructor of this course recommended by other
    students?
16) How fair (objective) do you think the instructor of this
    course will be in grading you?

Table 2.—No Change in Grade Expectations Mean Ratings


                        First Evaluation   Second Evaluation
Variables                    Means               Means

 1. PREPARATION              4.5738              4.2295  +
 2. STIMULATION              4.3279              3.9836  +
 3. THINKING                 4.2787              4.1957
 4. COMMUNICATED             4.5574              4.5082
 5. RESPECT                  4.4918              4.7377  +
 6. ACCESSIBLE               4.1803              4.0984
 7. TESTS                    4.4754              4.0164  +
 8. ASSIGNMENTS              4.4754              4.3115
 9. LECTURES                 4.4426              4.2295
10. DISCUSSION               4.4098              4.6220  +
11. PERSONALITY              4.3115              4.8351  +
12. EDUCATIONAL              4.3279              3.8361  +
13. PRESENTATION             4.2295              4.0000
14. OVERALL                  4.2459              4.1475
15. RECOMMENDED              3.0984              2.7213
16. FAIRNESS                 4.2131              4.4758  +
GRAND MEANS                  4.2900              4.1844
STANDARD DEVIATIONS           .3303               .4682

a p < .01     b p < .05



Table 3.—Downward Change in Grade Expectations Mean Ratings


                        First Evaluation   Second Evaluation
Variables                    Means               Means

 1. PREPARATION              4.5455              [illegible]
 2. STIMULATION              4.2424              3.3333  +
 3. THINKING                 4.515               3.8788
 4. COMMUNICATED             4.5152              4.0606  +
 5. RESPECT                  4.6667              4.5152
 6. ACCESSIBLE               4.4545              4.1515
 7. TESTS                    4.6364              3.8788  +
 8. ASSIGNMENTS              4.6970              4.3333  +
 9. LECTURES                 4.6061              4.2727  +
10. DISCUSSION               4.3636              4.2707
11. PERSONALITY              4.3333              4.5152
12. EDUCATIONAL              4.3939              3.3030  +
13. PRESENTATION             4.3030              3.5758  +
14. OVERALL                  4.2727              3.6667  +
15. RECOMMENDED              2.2727              2.0909
16. FAIRNESS                 4.3333              4.0303
GRAND MEANS                  4.2992              3.8807
STANDARD DEVIATIONS           .5466               .5853

a p < .01     b p < .05     c p < .10
expectations. This mass movement of student ratings on the
individual questionnaire items suggests the presence of a
negative “halo” effect consistent with the students'
lowered grade expectations. The t-test score of 2.023 with
30 degrees of freedom is significant at the .10 level for the
two-tailed test and just short of the 2.042 needed for
significance at the .05 level.


Upward-Change Category 


In Table 4, only four of the rating variables (Nos. 5, 8, 10, 
15) were significantly different between the first and 
second evaluations. But, again, all four of the variables 
changed upward in the direction of the positively dis- 
confirmed grade expectations of the students. This time it 
seems a positive “halo” effect was operating, as 14 of the 
16 rating variables changed upward (or remained the same 
in two cases) in the direction of the revised student grade 
expectations. Only two of the sixteen variables moved in 
the opposite direction. The t-value of 3.299 with 30 de- 
grees of freedom is significant at the .001 level. 
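
The three composite t-values reported above (.7135, 2.023, and
3.299, each with 30 degrees of freedom) can be approximately
reproduced from the grand means and standard deviations printed
in the three group tables. The Python sketch below does so with
a pooled two-sample t over the 16 item means of each evaluation.
It assumes that the printed standard deviations were computed
with divisor n rather than n - 1; this is an inference from how
closely the results match, not something the text states, and
the function name is illustrative.

```python
import math

def pooled_t_from_summary(mean_first, sd_first, mean_second, sd_second, n=16):
    """Pooled two-sample t from summary statistics.

    Assumes both standard deviations use divisor n (an inference,
    not stated in the source). Returns (t, degrees_of_freedom).
    """
    pooled_var = n * (sd_first ** 2 + sd_second ** 2) / (2 * n - 2)
    std_err = math.sqrt(pooled_var * 2 / n)
    return (mean_first - mean_second) / std_err, 2 * n - 2

# Grand means and standard deviations as printed for each group:
# (first mean, first SD, second mean, second SD).
summaries = {
    "no-change": (4.2900, 0.3303, 4.1844, 0.4682),
    "downward-change": (4.2992, 0.5466, 3.8807, 0.5853),
    "upward-change": (4.1310, 0.1855, 4.3527, 0.1828),
}

results = {}
for name, (m1, s1, m2, s2) in summaries.items():
    t, df = pooled_t_from_summary(m1, s1, m2, s2)
    results[name] = (t, df)
    print(f"{name}: |t| = {abs(t):.3f}, df = {df}")
```

The sign of t only records the direction of change (positive
when ratings fell), so the comparison with the reported values
uses the absolute value.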


Regression Analysis with Binary Variables 


In order to obtain additional information about the 
precise nature of the relationship between student grade 
expectations and instructor evaluations, regression analysis 
was employed with binary variables as independent variables. 
When using binary variables exclusively as independent 
variables, it can be shown that regression analysis amounts 
to a variation of an analysis of variance test.³

The dependent variable used in the model represented
the difference in mean evaluation scores between the first
and second surveys for each of the 16 questions, for each
of the three groups. This is denoted as
$d_i = \bar{y}_{i2} - \bar{y}_{i1}$ (second-survey mean minus
first-survey mean), where $i = 1, \ldots, 48$. There are thus a
total of 48 observations available for use with the regression
equation. The first independent variable, $X_1$, is defined to be a
1 (and 0 otherwise) for each of the 16 values of $d_i$
associated with the group of students whose grade expectations
rose. The second independent variable, $X_2$, is defined to be
a 1 (and 0 otherwise) for each of the 16 values of $d_i$
associated with the group of students whose grade expectations
fell over the semester. It can be shown that the effect of
defining the regression in this way is to produce the
following result:


$$\hat{d} = \mu_{NC} + (\mu_{ER} - \mu_{NC})X_1 + (\mu_{EF} - \mu_{NC})X_2$$


where:
$\hat{d}$ = the calculated difference in evaluation scores
between the first and second surveys
$\mu_{NC}$ = the average value of $d_i$ for the “no-change” group
$\mu_{EF}$ = the average value of $d_i$ for the group whose
expectations fell


Table 4.—Upward Change in Grade Expectations Mean Ratings


                        First Evaluation   Second Evaluation
Variables                    Means               Means

 1. PREPARATION              4.4286              4.2857
 2. STIMULATION              4.2857              4.2143
 3. THINKING                 4.1429              4.5000
 4. COMMUNICATED             4.0000              4.3571
 5. RESPECT                  4.2143              4.6429  +
 6. ACCESSIBLE               3.9286              4.3571
 7. TESTS                    4.1429              4.2143
 8. ASSIGNMENTS              4.0714              4.5000  +
 9. LECTURES                 4.2143              4.4286
10. DISCUSSION               4.1429              4.6429  +
11. PERSONALITY              4.3571              4.5714
12. EDUCATIONAL              4.1428              4.1428
13. PRESENTATION             4.0714              4.1428
14. OVERALL                  3.9285              4.2857
15. RECOMMENDED              3.6667              4.0000  +
16. FAIRNESS                 4.3571              4.3571
GRAND MEANS                  4.1310              4.3527
STANDARD DEVIATIONS           .1855               .1828

a p < .01     b p < .05     c p < .10


$\mu_{ER}$ = the average value of $d_i$ for the group whose
expectations rose.⁴


The constant term in this model represents the average
value of $d_i$ for the "no-change" group, while the
coefficients of $X_1$ and $X_2$ represent, respectively, the
difference between $\mu_{ER}$ and $\mu_{NC}$, and $\mu_{EF}$ and $\mu_{NC}$.
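
The equivalence described above, in which the constant equals
the "no-change" group mean and each coefficient equals a
difference between group means, can be checked numerically. The
Python sketch below fits the regression by ordinary least squares
with NumPy. The 48 values of d are simulated for illustration
only and are not the study's data; the group labels, seed, and
function name are likewise assumptions.

```python
import numpy as np

def fit_binary_regression(d, group):
    """Least-squares fit of d on an intercept, X1 (rose), and X2 (fell)."""
    d = np.asarray(d, dtype=float)
    group = np.asarray(group)
    x1 = (group == "rose").astype(float)   # 1 for the group whose expectations rose
    x2 = (group == "fell").astype(float)   # 1 for the group whose expectations fell
    design = np.column_stack([np.ones_like(d), x1, x2])
    coef, *_ = np.linalg.lstsq(design, d, rcond=None)
    return coef

# Simulated differences: 16 questionnaire items for each of three groups.
rng = np.random.default_rng(0)
group = np.repeat(["no_change", "rose", "fell"], 16)
center = {"no_change": -0.1, "rose": 0.2, "fell": -0.4}
d = np.array([center[g] for g in group]) + rng.normal(0.0, 0.2, size=48)

b0, b1, b2 = fit_binary_regression(d, group)
mean_nc = d[group == "no_change"].mean()
mean_rose = d[group == "rose"].mean()
mean_fell = d[group == "fell"].mean()

# The fitted constant and coefficients match the group-mean expressions.
print(np.isclose(b0, mean_nc))
print(np.isclose(b1, mean_rose - mean_nc))
print(np.isclose(b2, mean_fell - mean_nc))
```

Treating the "no-change" group as the omitted baseline is what
makes the constant term its mean; choosing a different baseline
would change the coefficients but not the fitted values.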


Table 5 summarizes the results of the experiment. The
average value of $d_i$ (-.106) for the "no-change" group is
not significantly different from zero, which means that
student evaluation scores did not fall significantly between
the first and second surveys among those students whose
grade expectations did not change. However, the
coefficients for $X_1$ and $X_2$ are significantly different from
zero (beyond the .01 level), indicating that the difference
between $\mu_{EF}$ and $\mu_{NC}$ and the difference between
$\mu_{ER}$ and $\mu_{NC}$ are both significantly different from zero.

What this means is that the average difference between the
first and second evaluation scores was significantly lower,
relative to the average difference for the “no-change”
group, for those whose grade expectations fell over the
semester. Similarly, the coefficient of $X_1$ shows that the
average difference between the first and second evaluation
scores was significantly higher, relative to the average
difference for the “no-change” group, for those whose grade
expectations rose during the semester. One significant
finding that stands out is that the impact on the average
difference in evaluation scores of a downward change in
grade expectations appears to be about the same order of
magnitude as the impact caused by an upward adjustment
in grade expectations, although the effects are reversed.
That is, the coefficients for both variables are virtually the
same in absolute value.

The explained sum of squares for this regression equation
is simply the sum of squared deviations of group means
from the overall mean of $d_i$. Thus, the $R^2$ for this
regression will be large when between-group variation of $d_i$ is
large relative to the within-group variation. The value of $R^2$
for this regression indicates that approximately 50% of the
variation in the dependent variable is being explained by the
independent variables, $X_1$ and $X_2$.


Table 5.—Calculated Relative Differences in Instructor Evaluation
Scores for Student Grade Expectations Groups


$$\hat{d} = -.106 + .327^{*}X_1 - .313^{*}X_2$$

Standard errors of the distribution of coefficients:
(.070) for the constant, (.098) and (.098) for the two
coefficients.

*p < .01


Summarizing, during the semester, 61 of 108 sample
students did not change their grade expectations, 33
students revised downward, and 14 adjusted their
expectations upward. In the “no-change” group, the overall
ratings of instructor and course did not significantly change
from the first week to the last week of classes. The
“adjusted-down” students significantly lowered (.01) their
final overall ratings compared to initial ratings. Conversely,
in the “adjusted-up” group, students significantly raised
(.001) their evaluation on the final survey. Initial ratings
of instructors and courses were the lowest in the
“adjusted-up” group, followed by the “no-change” and
“adjusted-down” groups. On the final evaluations, however, ratings
were highest in the “adjusted-up” group, followed by the
“no-change” and “adjusted-down” groups. The absolute
value of the adjustment score was the largest in the
“adjusted-down” group, followed by the “up” and
“no-change” groups.

Results were consistent with a hypothesis that a
conservative initial (grade) expectation contradicted or
disconfirmed by subsequent positive feedback tends to have a
positive upward effect on instructor and course ratings by
students. On the other hand, elevated initial (grade)
expectations disconfirmed by subsequent negative feedback
tend to negatively influence instructor and course
evaluations. Interestingly, the positive impact on teacher and
course evaluations of an increase in student grade
expectations is approximately the same as the negative impact of
a decline in grade expectations.
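
The statement that the explained sum of squares is the sum of
squared deviations of group means from the overall mean can be
illustrated with a short Python sketch. It computes R² two ways,
from between-group and total sums of squares and from the
residuals of the binary-variable regression, and the two agree.
The data are simulated for illustration and are not the study's
values; the function names and the "no_change" baseline label
are assumptions.

```python
import numpy as np

def r2_between_groups(d, group):
    """R-squared as between-group sum of squares over total sum of squares."""
    d = np.asarray(d, dtype=float)
    group = np.asarray(group)
    grand_mean = d.mean()
    ss_total = ((d - grand_mean) ** 2).sum()
    ss_between = 0.0
    for label in np.unique(group):
        members = d[group == label]
        ss_between += members.size * (members.mean() - grand_mean) ** 2
    return ss_between / ss_total

def r2_binary_regression(d, group, baseline="no_change"):
    """R-squared from a least-squares fit on an intercept and group indicators."""
    d = np.asarray(d, dtype=float)
    group = np.asarray(group)
    others = [g for g in np.unique(group) if g != baseline]
    columns = [np.ones_like(d)] + [(group == g).astype(float) for g in others]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, d, rcond=None)
    residual = d - design @ coef
    return 1.0 - (residual ** 2).sum() / ((d - d.mean()) ** 2).sum()

# Simulated differences: 16 items in each of three groups.
rng = np.random.default_rng(1)
group = np.repeat(["no_change", "rose", "fell"], 16)
center = {"no_change": -0.1, "rose": 0.2, "fell": -0.4}
d = np.array([center[g] for g in group]) + rng.normal(0.0, 0.25, size=48)

r2_groups = r2_between_groups(d, group)
r2_fit = r2_binary_regression(d, group)
print(np.isclose(r2_groups, r2_fit))
```

With equal group sizes, as here, each group mean carries the
same weight of 16 observations in the between-group sum.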


Discussion and Implications 

All thirteen significantly different rating variables
between the two surveys in both the downward- and
upward-change student groups moved in the direction of the change
in student grade expectations. Therefore, results of this 
empirical study strongly support the research hypothesis 
that instructor and course evaluations by students tend to 
vary directly, when grade expectations are disconfirmed, 
with the directional change (either up or down) in student 
grade expectations. 

Several implications can be derived from this research 
for college and university teachers, administrators, and 
students alike. First, instructors would do well to guard 
against creating unrealistically high grade expectations for 
students at the beginning of a course. In fact, it may be that 
an opposite approach, i. e., generating low grade expecta- 
tions initially, will lead to higher student evaluations of the 
instructor and course. It is significant that more than 
twice as many of the rating variables (nine vs. four) fell as 
rose when student grade expectations were disconfirmed, 
either negatively or positively. Apparently, a downward 
decline in student grade expectations can be especially 
adverse to student evaluations of the instructor and course. 

Obviously, much more needs to be discovered about 
student expectations and their effect on perceptions and 
evaluations. It is hoped that this study will stimulate 
further investigation of the impact of changing expectations 
on student evaluations of instructor performance. 

Since instructor self-improvement should be the main 
purpose of student evaluations, with administrative decisions 
on faculty salary, promotion, and tenure secondary con- 
siderations, administrators should carefully review all 
existing evaluation systems to ensure that they are providing 
faculty with sufficient informational feedback to encourage 
action to improve teaching effectiveness. Mere rankings or 
percentile ratings on various dimensions provide little 
guidance to the instructor in determining ways to improve 
his performance. 

Students should be made aware of their responsibilities 
by administrators and faculty. The students should be en- 
couraged to exercise honest, mature judgment in eval- 
uating instructors and courses as precisely and comprehen- 
sively as possible. Too often, the student views the evalua- 
tion process as an imposition on his time—a task he wants 
to complete as quickly as possible. Seldom is more than 
ten minutes allotted to this individual evaluation procedure, 
yet the outcome can be critical to faculty careers,
administrative decisions, and student satisfaction ... and, over
the long run, community progress.
Administrators must use caution and judgment in 
attempting to compare one instructor with another on 
the basis of student evaluations. To date, the available 
research findings on ratings of teaching effectiveness are 
“incomplete, inconclusive, and of limited value" (5). Much 
more reliability than validity exists in most present instruc- 
tor evaluation systems. Further research on teaching ef- 
fectiveness needs to be emphasized in colleges and univer- 
sities in all departments, and findings disseminated to 
faculty, administrators, and students. This research should 
be conducted as part of a systematic and on-going program 
designed to identify relevant behavioral, psychological, or 
environmental variables directly related to student gains 
or objectives. Research may even indicate that a number 
of rating scales will be needed for specific uses, e. g., in 
specific subject areas, for different learning objectives, 
and for students of various backgrounds, levels, or need 
orientations. 

Nearly all participants in the educational process agree 
that instructor evaluation is a necessary and healthy 
activity, but the procedures and techniques in obtaining 
and interpreting data from the evaluation process need to 
be improved. Working together, faculty, administrators, 
and students can create a cooperative, as opposed to a 
competitive, atmosphere where progress can be rapidly 
made toward the advantage of all participants. 


FOOTNOTES 


1. The authors wish to acknowledge the advice and assistance 
of Dr. Kenneth E. Galchus, Assistant Professor of Quantitative 
Sciences in Business and Economics at Old Dominion University, 
Norfolk, Virginia. 

2. Program DSCRIM, see (14). 

3. See Jan Kmenta, Elements of Econometrics, Macmillan, 
New York, 1971, 409-430. 

4. Ibid., pp. 410-415. 

5. Even though the present study included only one rating 
variable (#12) dealing directly with the course itself rather than 
the instructor, it has been found that students tend to rate the 
course and teacher the same (16). 


ENVIRONMENTAL NUMBNESS 


ABSTRACT 


Following research on [...] the behavior of users of other 
institutions, a naturalistic study [...] observed and recorded 
the behavior of university students in a laboratory which had 
been slightly altered to [...] 


HOMO SAPIENS is commonly and correctly regarded [...] 
shaper of the natural environment. Especially in a 
contemporary society, when one gazes over [...] or dam, the 
image of “conqueror of nature” seems a truism. People have 
tremendously altered [...] of earth's elements in active [...] 
the comforts and amenities first envisioned by [...] members 
of the species. 

Only recently has a broad awareness come that the shaper 
is, in part, shaped by his creations. When planners are 
proportionately few and users are not consulted in the 
design of a building, a danger arises that the needs of those 
who must spend large [...] in the structure will not be [...] 
man lives; beneath the imposing skyline, micro-environments 
of dubious comfort and dignity exist. Not all these are in 
slums and ghettos; new and architectural award-winning 
structures have come in for their just share of the [...]. 

Sommer (4) has discussed numerous types of [...] 
arrangement of furniture [...] has apparently played an 
important role in behavior. He found [...] malice on the part 
of institutional authorities, but rather an ignorance of the 
principles of design coupled with a de facto default in this 
matter on the part of maintenance workers, who are not 
interested in facilitation of learning in the users. 

Users of public and semi-public buildings seem to develop 
an “environmental numbness” (5) to unpleasant sounds, 
sights, and arrangements. In one informal experiment, 
visitors were seated in front of a very annoying fan. None 
complained, but when finally the sound was consciously 
brought to their attention, nearly all acknowledged its 
unpleasantness. Sommer (4) feels that prolonged exposure to 
an institutional setting tends to lead to [...] on the part of the 
user that whatever the setting, unpleasant or not, any change 
is beyond [...] is related to the same societal [...] to the stiff 
suppression of user-initiated changes in [...] of the 
environment, as in the case of People’s Park in Berkeley and 
the custodian’s every day [...]. 

A previous study (2) of student reaction to surroundings 
found that students will, in their dormitories, handle an 
unliked but school-owned piece of furniture (a stiff chair and 
study desk) by ignoring it. Most students, and especially 
those with [...] grade point averages, were found [...] floor, 
the [...] a lounge chair. 

The present study was designed to investigate student 
behavior in a situation where no alternative was available; 
they had to either accept or alter [...]. It was hypothesized 
that the students would not substantially [...] despite their 
membership in a [...] high environmental awareness group. 


Method 


A university [...] course in experimental psychology 
routinely conducts a didactic experiment in short-term 
memory early in each semester. The laboratory used for this 
purpose [...] for naturalistic [...] arrangements of 
furnishings. The tables and chairs in the [...] are of light 
construction, which no student [...] have difficulty in 
moving. The floor consists of waxed and polished tile. 

The laboratory procedure calls for a pre-experimental 
discussion, the experiment, and a post-experiment discussion 
of results. The first and third parts take place in the [...]; the 
experiment occurs in cubicles constructed from movable 
partitions along the sides of the room. During the exper- 
iment, pairs of students test each other, alternating E/S 
roles. 

Figure 1.—Schematic Diagram of Laboratory 

Prior to four laboratory sections, the tables and 
cubicles were carefully arranged as in Figure 1. This ar- 
rangement was on the whole the same as the students 
had previously experienced, except that all passages were 
reduced to the minimum necessary for a slender to 
medium-sized male to pass through without being forced to 
move the furniture. Thus, at each asterisked gap in Figure 
1, the distance between articles of furniture was set at 
6-7 inches. In order to pass, as the students had to do to 
move from central tables to cubicles, it was necessary to 
turn sideways and, depending on girth, maneuver carefully 
to avoid moving furniture. Of course, all students also had 
to pass through the frontal barrier from the entryway to 
their seating places at the beginning and end of the class, 
or to consult with the teacher if he happened to be in 
front of the barrier of tables. 


[...] use of the [...] greater distance, allowing 
its occupant plenty of room to move. Although this 12-[...] 
too close for ease of movement, [...] naturalistic investigation 
of [...] was made in situations where [...] 
felt free to set this distance, or at least had no impediment 
physically blocking the back of their chair. 
In an academic office complex, an observer walked 
through the corridors and at every open door quietly asked 
the occupant, if they were involved in deskwork, not to 
move. After explaining that it was not a hold-up, the 
observer measured the distance where the person had been 
sitting (back of chair to edge of desk). The mean distance 
was 19.1 inches, with a standard deviation of 4.8 inches. 
Only one of 32 people sat at 12 inches or less. It was ob- 
served, incidentally, that a strange sex difference seems to 
obtain in desk seating patterns. Nine of the 32 people sat 
at an angle to the desk (their distance was to the center 
of the chair back), and eight of these were male. The sam- 
ple was equally divided between males and females. This 
difference did not seem to be task-related as all people 
except one were reading or writing; the one typist cannot 
explain the female tendency to sit with evenly placed 
chairs. 

In the carefully designed inhospitality of the laboratory, 
the instructors measured how often and how much fur- 
niture was adjusted. The cubicles were examined after class, 
and the central area tables were examined after the pre- 
experimental discussion (while the students were involved 
in [...] in the cubicles), after the mid-experiment 
switch in E/S roles (which necessitated students coming 
out of the cubicles and, often, through the frontal barriers, 
as well as switching chairs in the cubicles), and after class. 
The observation and measurement was done covertly. At 
these times, the inhospitable distances were reset when 
they had been adjusted and recorded. 


Three instructors participated after explanation and 
training in the experimental procedure. Thirty-four students 
unwittingly served as Ss. The following week they were in- 
formed of the experiment and queried as to their recol- 
lections of the experience. 


Results 


The two measures used, frequency and amount of 
furniture adjusted, were applied to two types of furniture, 
the central area tables and the cubicle chairs. Since the 
cubicle chair distance could only be increased by pushing 
the cubicle tables away, the data actually consist entirely 
of amount and frequency of table movement. 

An estimate of total tight-squeeze passages was first 
made, in order to compare that number with central area 
table movements made. Each student had to enter and 
leave through the frontal barrier, and make the return trip 
once in mid-class for picking up role instructions in the 
memory experiment. In addition, each student had to make 
three entries and exits from a cubicle for the same reasons. 
Beyond that, students often emerged from the cubicles 
with a question, but as no count of the exact number of 
these queries was made, they are not included in the total. 
The total, a conservative one, is 238 passages through bar- 
riers no more than 6-7 inches wide (7 passages × 34 
students). 

The frontal barrier table gaps were adjusted by students 
exactly twice. The cubicle-entryway passages were adjusted 
twice by moving tables and three times by moving a cubicle 
wall panel. In all seven cases, adjustment just sufficient for 
passage without turning sideways was made. Thus, slightly 
more than 97% of all passages yielded to the position of 
the table and whatever else formed the other half of the 
tight gap. In less than 3% of all passages did students fail 
to accept this Scylla and Charybdis situation. And then 
they only moved the tables barely enough to squeeze 
through themselves. None of the 34 seemed remotely close 
to suggesting that the whole situation was uncomfortable 
or changing the room as a whole. Of course they had never 
received any direct communication that such behavior was 
not allowed. 
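
The percentages quoted above follow directly from the counts reported in the text. A minimal sketch of that arithmetic is given here; the variable names are illustrative and are not part of the original study.

```python
# Counts taken from the text; variable names are illustrative only.
passages_per_student = 7
students = 34
adjustments = 2 + 2 + 3  # frontal gaps, cubicle tables, cubicle wall panels

total_passages = passages_per_student * students
adjusted_share = adjustments / total_passages
accepted_share = 1 - adjusted_share

print(total_passages)                  # 238
print(round(100 * adjusted_share, 1))  # 2.9
print(round(100 * accepted_share, 1))  # 97.1
```

Because the passage total is described as conservative, the true share of accepted passages would, if anything, be slightly higher than this figure.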


The cubicle chair-table distances were, if anything, rel- 
atively tighter than the central area table distances, and a 
little more adjustment was observed. The mean adjustment 
from 12 inches was 1.9 inches. The distribution, 
however, was skewed; 70% moved their distance 2 inches 
or less. Essentially, a few people moved the table quite a 
bit and most moved it not at all or only incidentally, 
perhaps accidentally. The most adjustment, to 17 inches, 
was done by three subjects. This is still 2 inches less than 
the mean of the naturalistic observation. The difference 
between the means of the 32 naturalistically observed 
people involved in deskwork and the 17 students’ chairs 
(used by 19 students because two of them switched 
chairs when the E/S roles were switched in the memory 
experiment) was significant (t = 4.17, p < .001). 


Discussion 


The data suggest quite strongly that students in a class- 
room will repeatedly (seven or more times) accept an im- 
pediment rather than adjust it to levels of comfort. Most 
of these students also accepted, in the same space of about 
90 minutes, an uncomfortable seating arrangement in cu- 
bicles. They spent a great majority of their sitting and walk- 
ing time in the class experiencing and yielding to minor 
barriers of furniture. None of them made more than a 
short-range adjustment of tables and chairs to accommodate 
his or her own body at the moment, and even these subjects 
were very rare. One instructor-experimenter noted that 
two of the seven adjustments were a necessity; the student 
was simply too large to fit through a 7-inch gap. 

All the observers noted student efforts to avoid moving 
the furniture, such as grunts, swiveling of hips, and willing- 
ness to line up for passage through a tight squeeze. The 
tables came to seem magically immobile; one knew they 
required only a tiny amount of effort to move, yet they 
withstood over 238 carefully maneuvered people-passages. 

The following week when all students were told of the 
experiment and asked to recall their experience of it, sur- 
prisingly few (one) even remembered there being any form 
of impediment. Others were at a loss to recall it, although 
one volunteered the explanation that perhaps the tables 
were "supposed" to be that way. In their previous classes, 
tables and chairs were relatively disordered, with large 
handy gaps, as the author discovered when he began to 
set up a thorough system of impediments. If the present 
results are generally valid, one wonders how long it would 
take for students to get the tables disordered! (Of course 
maintenance workers might change table positions in the 
course of their duties.) 

Why students adjust to furniture rather than adjusting 
it is not clear. The differences between the naturalistically 
observed deskwork situation and the experimental situation 
provide several hypotheses worth further investigation. 
Possibly, in student perception, institutionally owned 
furniture is not a part of the student’s personal area of 
control. Yet the offices, where movement had been observed, 
also contained furniture not owned by the individuals. The 
differences which are salient are (a) that furniture is 
perceived as within personal control in an office and not 
in a classroom and (b) that the office is an individual (or 
perhaps a twosome) domain, while the class is a group of 
people. Probably the office group was an older group, 
and this indirectly or directly mediated the results. How- 
ever, there is little doubt the experimental distances were 
below the comfort range for most people. 

If task-involvement in the memory experiment, to the 
detriment of personal comfort, is advanced as a hypothesis, 
another implication arises. Though no check of student 
attitudes was made in this study, one would expect such 
repeated minor discomfort to develop into a variety of 
irritations and negative attitudes among the students. If 
they do not know why they feel badly toward a given class 
or situation, they are apt to ascribe it to whatever is most 
handy—the teacher, the school, their classmates. This 
could be the beginning of an unfortunate deterioration in 
whatever valuable relationships other efforts in the school 
had begun. The example used in this study, slight frequent 
altercations with tables, is not in itself significant; yet it 
may typify a range of subtle frustrations in classrooms 
which are below the threshold of awareness for all con- 
cerned. But if they are pointed out or discerned through 
a careful survey of the physical plant, even a new award- 
winning one (4), they can often very easily be changed 
or at least ameliorated. 


AN A PRIORI APPROACH FOR DEVELOPING 
SHORT-FORMS OF TESTS AND INVENTORIES 


JULIAN L. BIGGERS 
Texas Tech University 


ABSTRACT 


[...] numerous advantages to the researcher. 


[...] design of a particular study. The obvious solution would 
be the development of an abbreviated or short-form of 
[...] that the traditional process for developing a shortened 
scale involves more labor than his originally planned re- 
search, and he thus abandons measurement of the trait 
as part of the research. The purpose of this article is to 
point to a simplified procedure that may serve adequately 
for developing a short-form of a test when the basic as- 
sumptions are met. 

The usual procedure for devising a short-form of a test is 
empirical in nature. The task involves (a) administration 
of the original instrument to a sample of examinees sim- 
ilar to the target population of the study; (b) item analysis 
to identify the items with highest association with the total 
score on the parent test; and (c) selection of the requisite 
number of items for the short-form. Additional work is 
required to establish the derived instrument's reliability 
and validity. Obviously, a great deal of labor and com- 
putation is required before the final product is obtained. 

The desired end product of the effort is to produce a 
miniature parallel edition of the full-scale. If the parent 
[...] is viewed as being made up of n parallel short- 
forms, the task of the researcher becomes that of identify- 
ing [...] the length requirements for his study. It 
should be obvious that the empirical approach [...] not 
accomplish this task. Items identified and selected with 
[...] item-total score correlations [...] likely to 
[...] short-form that is not parallel with other pos- 
sible [...] made up of the items not so selected. 
[...] develops a new short-form with some de- 
gree of [...] validity with, but not parallel to, 
the full-scale. Applying the reverse process should make 
the point clear. Starting with the short-form and adding 
new items with similar characteristics to produce an in- 
strument as long as the original full-scale should produce 
an instrument with statistical characteristics different 
from the original full-length edition. Cloaking an empiri- 
cally [...] short-form with all the characteristics of 
the parent instrument is a tenuous proposition at best. 
The theoretical basis for the Spearman-Brown 
formula provides a simplified alternative to the empirical 
approach for developing a truly parallel short-form of a 
test. The Spearman-Brown formula is most often cited 
in the literature in association with estimating the re- 
liability of lengthened tests. Overlooked is the practical 
[...] apply when reducing 
the length of an instrument. Gulliksen (1) points out that 
[...] met, the Spearman-Brown formula 
[...] results, not an estimate of the reliability 
[...] the Spearman-Brown formula 
[...] unidimensional in the trait measured 
[...] be homogeneous with those 
[...] 

[...] for producing a short-form via 
Spearman-Brown theory [...] 
or assuming the reliability of the full-scale for the population 
to be examined; (b) estimating the reduction in scale length 
[...] 


The a priori method has obvious advantages over the 
empirical method. Only an estimate of the reliability of 
the total scale for the target population is needed. This 
requirement may occasion the administration of the in- 
strument to a sample, but that task is inherent in both 
procedures. When the reliability is already known for 
similar samples, the researcher may be willing to accept 
this estimate without further effort. All remaining work 
for estimating the reduced length or reliability and the 
sampling procedure to select items can be carried out in 
a few moments at the researcher's desk. Finally, the re- 
sulting short-form will have reliability and length charac- 
teristics dictated by the researcher in advance. Using the 
empirical approach, only the length might be determined 
in advance; the reliability would have to be obtained after 
actual use of the short-form. 
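
The desk calculation described above can be sketched with the standard Spearman-Brown relation, in which a test whose length is multiplied by k has predicted reliability kr / (1 + (k - 1)r). The function names, the target reliability of .70, and the 40-item parent length below are assumptions made for illustration; the value .83 is the full-scale estimate reported later in this article.

```python
import math


def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def length_factor_for(reliability: float, target: float) -> float:
    """Length multiplier needed to move from `reliability` to `target`."""
    return (target * (1 - reliability)) / (reliability * (1 - target))


full_scale = 0.83                         # full-scale estimate from the article
half = spearman_brown(full_scale, 0.5)    # predicted reliability at half length
back = spearman_brown(half, 2.0)          # doubling again recovers the original

target = 0.70                             # hypothetical reliability wanted in advance
k = length_factor_for(full_scale, target)
items_needed = math.ceil(k * 40)          # assumes a 40-item parent scale

print(round(half, 3), round(back, 3), round(k, 3), items_needed)
```

The second function is simply the first one solved for k, which is what lets the researcher fix the reliability in advance and read off the required length.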

Adherents to the empirical method may concede that 
the foregoing is true, but will likely point out that although 
their way is arduous, expensive, and time consuming, the 
method results in the identification and selection of items 
most appropriate for the population to be tested. The 
assertion is not denied, but it may, as discussed previously, 
be overstressed. A high coefficient of internal consistency 
suggests a strong item-total score correlation, and the Spear- 
man-Brown approach obtains a sample of those items to 
produce a parallel form with a predetermined reliability. 
Neither approach, empirical or a priori, would be appro- 
priate if the full-scale had a low reliability coefficient 
for the population. 


The Problem 

The basic question to be answered is, How will a short- 
form produced by the a priori Spearman-Brown method 
compare with an empirically developed instrument? To 
obtain an answer, a short-form of Rokeach’s Dogmatism 
Scale (2) was produced and compared with two empirically 
produced versions. 

Schulze (3) developed a 10-item version of the Dog- 
matism Scale using Guttman’s scalogram analysis as the 
mode for empirically selecting the items. Troldahl and 
Powell (4) produced data for a 20-item scale and also for 
even shorter versions using the regular item analysis tech- 
nique. Schulze tested college students in his work, while 
Troldahl and Powell studied adult residents of Lansing, 
Michigan, and Boston. The two empirically developed 
scales have only four items in common, demonstrating the 
variable item values obtained in different populations and 
with different item-selecting techniques. Neither study 
reports direct evidence of reliability of the short-form in 
the usually accepted sense. Schulze reported a coefficient 
of reproducibility (.83) and pointed to similar [...] ob- 
tained in two separate studies as evidence of the [...] 
of his 10-item scale. Troldahl and Powell used a statistical 
procedure involving variance estimates (described later) to 
conclude that their 20-item instrument would have a split- 
half reliability of .79. 


Methodology 


[...] of the Dogmatism Scale was developed 
by [...] odd-numbered items from Rokeach’s 
[...] scale. The item-selection [...] will be 
recognized as the traditional [...] 
requirements of the assumptions and is a simpler 
process than the more [...] item- 
selection procedure. The resulting [...] scale, 
using 50% of the available items, had chance- 
level commonality with the two empirically [...] 
scales. Four of Schulze's ten items (40%) and nine of the 
twenty items (45%) in the Troldahl and Powell scale ap- 
peared among the odd-numbered items selected for the 
experimental short-form. 
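
What chance-level commonality means here can be illustrated with a hypergeometric model: if 20 items were drawn at random from the parent scale, how many items of a given empirical short-form would be expected among them? The sketch below is an illustration only. It assumes a 40-item parent scale (the twenty odd-numbered items being 50% of the items) and treats the odd-numbered selection as if it were a random draw; neither the model nor the function name comes from the article.

```python
from math import comb


def overlap_pmf(pool: int, chosen: int, subset: int, k: int) -> float:
    """P(exactly k of `subset` items fall among `chosen` items drawn from `pool`)."""
    return comb(subset, k) * comb(pool - subset, chosen - k) / comb(pool, chosen)


POOL = 40    # assumed length of the parent scale
CHOSEN = 20  # odd-numbered items used in the experimental short-form

# (label, length of the empirical short-form, overlap reported in the text)
scales = [("Schulze", 10, 4), ("Troldahl and Powell", 20, 9)]

for label, subset, observed in scales:
    expected = subset * CHOSEN / POOL  # expected overlap under random selection
    p_observed = overlap_pmf(POOL, CHOSEN, subset, observed)
    print(label, "expected:", expected, "observed:", observed,
          "P(observed):", round(p_observed, 3))
```

Under these assumptions the expected overlaps are 5 of 10 and 10 of 20 items, so the reported overlaps of 4 and 9 sit just below the chance expectation.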

The twenty odd-numbered items from the Dogmatism 
Scale were merged with twenty items from the F-Scale 
to produce an instrument similar in length to the full 
Dogmatism Scale. The experimental short-form and the full- 
length Dogmatism Scale were administered to under- 
graduates enrolled in an educational psychology course. 
A two-week interval occurred between administrations. 
Scores were obtained for the experimental short-form, 
the full-length inventory, and the odd- and even-numbered 
halves of the full-scale. Product-moment correlations 
were computed between the four sets of scores thus ob- 
tained. Table 1 summarizes the results of the analysis. 


Table 1.—Intercorrelations of the Short-Form and Full-Length [...] 


                                 Correlations between Test Forms 
Test Forms                         A       B       C       D 

A. Short-Form (Experimental)     [...] 
B. Full-Length Scale              .75    1.00     .92     .93 
C.   Odd-numbered Items          [...] 
D.   Even-numbered Items         [...] 


Analysis of Results 

The correlation between the odd and even halves of the 
full-length Dogmatism Scale when inserted in the Spear- 
man-Brown formula produced an estimated reliability 
coefficient of .83 for the full scale. This value is within 
the range of reliabilities reported by Rokeach (2: 89-90). 
The correlation of .78 between the experimental short- 
form and the odd-numbered items of the full-scale may 
be interpreted as the test-retest reliability for the ab- 
breviated scale. In a similar fashion, the correlation of the 
experimental short-form with the even-numbered items 
could be considered an estimate of the alternate form 
reliability after a two-week interval. Lastly, the experi- 
mental form’s correlation of .75 with the full-length ver- 
sion is the estimated predictive validity coefficient. The 
reliability estimates (split-half, test-retest, and alternate 
form) all appear to be in an acceptable range to warrant 
use of the short-form in group studies. The predictive 
validity coefficient is as high as that found in many studies 
of this nature. 

The statistical procedure used by Troldahl and Powell 
was followed in obtaining reliability estimates for all 
three instruments for comparison purposes. The full- 
length Dogmatism Scale in this study had a corrected 
split-half reliability of .83, which indicates that approx- 
imately 69% of the total variability is explained by the 
attributes the items had in common. A correlation of .92 
was obtained between the odd-numbered items and the 
full-scale. This indicates that 85% of the variability in the 
full-scale is explained by the odd-numbered items. An 
estimate of “true” variability represented by the odd items 
is 58% (.69 × .85). The square root of this percentage is 
an estimate of the split-half reliability of the experimental 
short-form, which is .77. When Troldahl and Powell fol- 
lowed this procedure, they obtained an estimated split- 
half reliability of .79 for their empirically derived 20- 
item scale. The 10-item Schulze scale has an estimated 
reliability of .62 applying the data available in a similar 
fashion. Attenuated to double length for comparison sake, 
an estimated reliability of .76 was obtained. 

All three short-forms have comparable reliability esti- 
mates. The slight variations might be attributed to rounding 
error or differences in reliability of the original full-scale 
for the populations tested. 


Summary 


The use of the Spearman-Brown theory to produce a 
short parallel version of a scale seems to be warranted. 
The parent instrument should fulfill the requirements of 
assumed unidimensionality and general equivalence of 
intercorrelations of items to be eligible. In addition, the 
reliability of the full-scale should be high enough for the 
population to be surveyed to allow for reduction by short- 
ening. 

A short-form of Rokeach’s Dogmatism Scale was pro- 
duced by selecting the twenty odd-numbered items. The 
a priori-developed scale was shown to have satisfactory 
statistical properties for a half-length edition. The ex- 
perimental scale had a reliability coefficient similar to that 
of two empirically developed scales even though item 
overlap between the scales was near the chance level. 
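
The variance-based estimate used in the Analysis of Results can be retraced numerically. The sketch below follows the steps as they are described there, using the two correlations reported in the text; the variable names are illustrative. The unrounded result comes out near .76, while rounding the intermediate proportions to .69 and .85 first, as the text does, gives a value that rounds to the reported .77.

```python
import math

full_split_half = 0.83  # corrected split-half reliability of the full scale
odd_full_corr = 0.92    # correlation of the odd-numbered items with the full scale

common_variance = full_split_half ** 2   # about .69
explained_by_odd = odd_full_corr ** 2    # about .85
true_variance_odd = common_variance * explained_by_odd
short_form_estimate = math.sqrt(true_variance_odd)

# Same steps with the intermediate values rounded as in the text.
rounded_estimate = math.sqrt(0.69 * 0.85)

print(round(short_form_estimate, 3))  # 0.764
print(round(rounded_estimate, 3))     # 0.766
```

Since the square root of a product of squares is just the product of the two correlations, the estimate reduces to .83 × .92.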
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PEER JUDGMENTS OF TEACHING COMPETENCE 
AS A FUNCTION OF FIELD INDEPENDENCE 


AND DOGMATISM 


JAMES B. VICTOR 
State University of New York at Albany 


The theoretical relationships among field independence, dogmatism, and a peer judgment criterion of professional competence 
were examined. The subjects were master's level interns in a training program. The data illustrate the reliability of the criterion 
and indicate that these interns differentiate between professional competence judgments and more interpersonal judgments when 
making peer choices. Neither dogmatism nor field independence alone predicts the criterion, but the interaction term for the two 
variables significantly predicts the criterion. It is the field dependent/highly dogmatic person who is chosen less often by his 
peers, while the field dependent/low dogmatic person is chosen more often. 


IN RECENT YEARS considerable attention has been 
paid to the issue of competence criteria for teachers, 
social workers, and others who work with children. Sev- 
eral authors have indicated peer judgments of preferred 
work-partners, in training situations, to be related to cer- 
tain interpersonal qualities, such as self-disclosure and inter- 
personal flexibility (10) and adaptable teaching behavior 
(11). The present study was designed to expand the 
network of such interpersonal variables. 

The program for training teachers of emotionally 
disturbed children at SUNY Albany is well-suited for 
gathering sociogram data. For one semester, each intern 
works closely with six to eight other interns in one of 
three settings for emotionally disturbed, neurologically 
impaired, or behaviorally difficult children. The interns 
share responsibility for program planning, case confer- 
ences, and the day-to-day workload. In addition to the 
intense daily work contact, the interns are together dur- 
ing academic course work. This setting provided the 
opportunity to discern who would come to be judged the 
more competent interns in the group. 

Witkin (21) and his colleagues have described a dimen- 
sion which they call field independence. This variable dif- 
ferentiates individuals in terms of their active striving, an- 
alytic attitude, and degree of self-awareness. In terms 
of interpersonal functioning, Witkin (20) states that “the 
less developed sense of separate identity of persons with a 
global cognitive style manifests itself in reliance on external 
sources for definition of their attitudes, judgments, senti- 
ments, and of their view of themselves.” A number of 
studies have illustrated that field dependent persons are 
more prone to be guided by positions attributed to an 
authority figure or peer group (1, 3, 13), remember verbal 
messages that are more social in context (4, 5, 6, 7, 8), and 
adapt their performance on a cognitive task to a modelling 
demonstration viewed on TV (19). While these data do 
clarify a person's characteristic interpersonal style, they do 
not give information as to the way the person will be 
viewed by others. 

Rokeach (18) describes another dimension of cog- 
nitive style which has been viewed as important in interper- 
sonal functioning. This construct, dogmatism, is thought 
to be related to a person's openness to new ideas and to 
the independent evaluation the person is able to make 
on incoming information. Dogmatism and field inde- 
pendence have both been seen as important constructs 
in teachers' interpersonal functioning. Measures of dog- 
matism and field independence share little variance with 
each other and display very low correlations with meas- 
ures which purportedly assess open, other-centered at- 
titudes and behavior. Clearly the views of those who define 
interpersonal functioning in terms of relative isomorphism 
between interpersonal openness and such constructs as 
dogmatism and/or field independence are overly simplistic. 

Since field independence and dogmatism are essentially 
uncorrelated, it is possible to identify individuals represent- 
ing combinations of levels on both constructs. Several 
studies employing this conceptualization have found that 
it is the high dogmatic/field dependent person who tends to 
be different from [...]. In these studies 
the high dogmatic/field dependent [...] has been found 
to have difficulty with both reversal and non-reversal shift 
[...] problems, score low on inventory scales 
[...] or dynamism of [...] teaching 
behavior, and score lower on a creativity [...] (14). 

It was the aim of this study to put the [...] 
conceptualization to a direct test of whether field inde- 
pendence and dogmatism taken together were related to 
peer judgments in a teaching situation. The hypothesis 
that the interaction term of field independence and dog- 
matism would be related to peer judgments of com- 
petence was drawn from the earlier Ohnmacht studies. 
In particular, persons with combinations of field dependent/ 
high dogmatic scores were viewed as most likely to receive 
a lower number of nominations. Also, because of their 
reliance on others for self-definition, field dependent in- 
terns who scored low on dogmatism were predicted to 
receive a higher number of nominations. 


Method 


Subjects 

The Ss were 50 master’s level students in an intern train- 
ing program for teachers of emotionally disturbed children. 
All students accepted into the program for a two-year 
period were included in the study. Program selection was 
made by usual procedures of test scores, previous academic 
records, and interviews. Selection staff were unaware of 
the students’ scores for both of the variables used in this 
study. All Ss were enrolled, after selection, in the same in- 
ternship teaching practicum. 


Procedure 

Before the program began, all Ss were administered the 
Hidden Figures Test (HFT), a measure of field independ- 
ence (12), and the Dogmatism Scale (DS), a measure of 
openmindedness or dogmatism (18). 

After one semester of practicum experience each S
was asked to nominate other individuals in his work group
as his first, second, or third choice as a partner for various
activities, that is, the person with whom S would prefer to
(1) teach emotionally disturbed children; (2) develop
programs for emotionally disturbed children; (3) work
with as a consultant in regard to emotionally disturbed
children; (4) talk to about a personal problem; and (5) take
to a party.

The inter-judge reliability was determined for these five 
questions for each group using a formula provided by 
Gordon (9). The judgments for “teach,” “consult,” and
“develop programs” were very similar, with reliability
coefficients each ranging from .49 to .95 with the median
at .74. The judgments for “take to a party” and “talk to
about a personal problem” were less reliable, ranging from
.32 to .50 with the median at .42.

The factor analysis using varimax rotation yielded two
orthogonal factors. The first, labelled professional
competence, showed loadings > .83 for the peer choices of
teaching, consulting, and program developing. The second
factor, interpersonal-social, loaded > .85 for choices of
taking to a party and talking to about a personal problem.
The correlation matrix and factor loadings are illustrated
in Table 1. Two new individual difference variables were
formed using factor scores for professional competence
judgments and interpersonal-social judgments.

Table 1.—Correlation Matrix and Factor Loadings for Sociogram Items

Sociogram Items: Develop program; Teach; Consult; Personal problem; Take to party
[Table body largely illegible. Surviving entries: Teach .84; Consult .83; Personal problem .16]

*Factor I accounts for 56% and Factor II 21% of the total variance


Table 2.—Correlation Matrix for Variables

Variables: Professional competence; Interpersonal-social; Dogmatism; Field independence; HFT X DS
[Table body not legible]

*p < .01


A regression analysis (2) was performed on five final
variables. The independent variables were HFT, DS, and
HFT X DS, the interaction term. A separate analysis was
performed against each of the criteria, professional
competence and interpersonal-social judgments.


Results 


The correlation matrix for the five final variables used
in the study yielded one correlation with a statistically
significant value (professional competence and HFT X DS:
r = .41, p < .01) as can be seen in Table 2.

The full model regression analysis yielded a significant
effect for the criterion professional competence judgment,
as can be seen from Table 3 (F = 5.09, df = 3/46, p < .005).
The main effects of HFT or DS did not reach levels of
significance; however, the HFT X DS interaction term
was statistically significant (F = 12.53, df = 1/46, p < .005).
The regression analysis of the interpersonal-social judgment
criterion did not yield even nominal levels of significance.
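
The overall F for a full regression model can be recovered from the proportion of variance it explains. The sketch below is an illustration only, not the author's procedure. It assumes that the entry "25" in the full-model row of Table 3 is R² = .25 (the decimal point is not legible in the source), that N = 50 Ss, and that the model has three predictors (HFT, DS, and HFT X DS). The function name is hypothetical.

```python
def overall_f(r_squared, n_subjects, n_predictors):
    """F ratio for a full regression model, computed from R^2.

    F = (R^2 / k) / ((1 - R^2) / (N - k - 1)), with df = k and N - k - 1.
    """
    df_model = n_predictors
    df_error = n_subjects - n_predictors - 1
    f_ratio = (r_squared / df_model) / ((1.0 - r_squared) / df_error)
    return f_ratio, df_model, df_error


# Assumed inputs: R^2 = .25 (as read from Table 3), N = 50 Ss, and
# 3 predictors (HFT, DS, and the HFT X DS interaction term).
f_ratio, df_model, df_error = overall_f(0.25, 50, 3)
print(round(f_ratio, 2), df_model, df_error)
```

Under these assumptions the ratio comes out near 5.1 on 3 and 46 degrees of freedom, which is in line with the reported F = 5.09; a gap of that size is what rounding R² to two places would produce.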

A median split technique was applied to HFT and DS
scores to test the hypothesis that persons scoring low on
HFT and high on DS would receive fewer peer choices
than those scoring low on both HFT and DS. The
score used for each S was his average nomination for the
choices of teaching, consulting, and developing programs.
The means and standard deviations for each of the four
cells are presented in Table 4. The difference between the
low HFT-high DS and the low HFT-low DS groups
yielded a statistically significant quantity (t = 1.96,
df = 24, p < .05, one-tailed test).
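
A median split followed by a comparison of two cells can be sketched as follows. This is a minimal illustration under stated assumptions: scores strictly above the sample median are labelled "high" (the source does not say how ties at the median were handled), the t test is taken to be the usual pooled-variance form, and all numbers are made up for the example rather than taken from the study. Function names are hypothetical.

```python
import math
import statistics


def median_split(scores):
    """Label each score 'high' if it is above the sample median, else 'low'."""
    cut = statistics.median(scores)
    return ["high" if s > cut else "low" for s in scores]


def pooled_t(group_a, group_b):
    """Independent-samples t with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df


# Made-up values for illustration only (not data from the study).
hft_labels = median_split([3, 9, 5, 12, 7, 10])
low_hft_low_ds = [2.0, 3.0, 4.0]    # hypothetical average nominations
low_hft_high_ds = [1.0, 2.0, 3.0]   # hypothetical average nominations
t_value, df = pooled_t(low_hft_low_ds, low_hft_high_ds)
print(hft_labels, round(t_value, 3), df)
```

If the reported test was of this pooled form, df = 24 would imply that the two cells compared contained 26 Ss in all.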


Discussion 


The teachers in this study did differentiate between
professional competence judgments of their peers, which
were more reliable, and interpersonal-social judgments.
The work conditions of the groups did vary in the amount
of close contact that each intern had with his fellow
workers, and this seemed to affect the inter-judge
reliability estimates. In general, the closer the contact
during the work experience, the higher the correlation
coefficient.

The present data are consistent with the Ohnmacht
studies (14, 15, 16, 17) that persons who score high
on the dogmatism scale and are field dependent are the
most predictable group, and they provide weak support
for the idea that these variables when considered together
provide useful information about which teachers will be
valued by their colleagues as professionally competent.
Interns with the particular combination of high
dogmatism/field dependent scores were chosen less by their
peers, while those who had low dogmatism/field dependent
scores were chosen more often. It is noteworthy that the
only three interns in the sample who were viewed as
isolates by their peers were in the high dogmatism/field
dependent group. In summary, the study confirms the
hypothesis that the personal characteristics of dogmatism
and field dependence contribute to competence judgments.
However, it is the interaction of these variables that is the
determining factor.

Table 3.—Multiple Regression Results for the Professional Competence
Peer Judgment Criterion

[Column headings not legible]
Predictor
Dogmatism -.21
HFT 2.22 n.s. 1/46 2
HFT X Dogmatism 12.53* 1/46 46
Full model 25 5.09* 3/46

*p < .005

Table 4.—Average Peer Selection Scores for Choices of Teaching,
Consulting, and Developing Programs for Ss in Cells, Using Median
Split Technique on HFT and DS

Field Independence
[Table body not legible]


FOOTNOTES 


1. Portions of this paper were presented at the Meeting of the
Eastern Psychological Association, Washington, D. C., 1973. The 
author wishes to thank Dr. Oliver M. Nikoloff for providing 
valuable assistance in support of this study. 

2. Requests for reprints should be sent to the author's 
address: Department of Educational Psychology and Statistics, 
State University of New York at Albany, 1400 Washington Avenue, 
Albany, N. Y., 12222. 
CAN SUGGESTIONS BY TEACHERS IMPROVE INSTRUCTION?


JOHN D. McNEIL 
University of California, Los Angeles 


ABSTRACT 


Twenty-four experienced teachers each selected a different reading skill and each prepared a set of written suggestions for
teaching the skill selected. Subsequently, 24 unsuccessful teachers, i.e., those less able than their peers to effect pupil
achievement in the skills, were identified. These unsuccessful teachers were randomly assigned to two treatment groups. One group
received the previously prepared suggestions and the other group did not. Later, the two groups of teachers again taught the
skills. It was found that all teachers receiving suggestions improved relative to their previous performance, i.e., their pupils
achieved more, while only slightly more than half of those without suggestions showed improvement (p < .05). Thus, it was
concluded that suggestions for teaching can be helpful to less successful teachers.


TEACHING HAS BEEN a very private kind of work.
Customarily, teachers have worked in different rooms at
the same time, thereby making it difficult to see each
other teach and, accordingly, impossible to help each
other on the basis of direct observation. Furthermore, as
indicated by Dreeben (2), teachers have lacked written
media for communicating about their work because
their occupation has had no counterpart to the scholar’s
research tradition in which knowledge is accumulated
in books and journals, or to the physician’s case records
in which tests and prior medical decisions are documented.

Now, however, there are signs that the fragmentation
of the colleague group may diminish and that teachers
may not be left alone to determine what they are doing
right or wrong. For example, new guidelines of the Right
to Read program stress staff development involving all
school personnel in activities directly related to everyday
classroom instruction (1). Team teaching, diversified
staffing, videotaping of lessons, faculty intervisitations,
and teachers’ centers are other innovations that promise
to make it easier for teachers to aid one another in school
settings as they work through their instructional problems.

Although it is a well-substantiated fact that teachers
continue to receive most of their assistance for
self-recognized weaknesses from their peers or from their own
trials and success (3), there is little evidence that the help
received makes a difference in the growth of pupils. We
do not know whether suggestions from peers are valid,
i.e., contribute to pupil achievement. Indeed, teacher
preference for peer assistance might be nothing more than
a defense against such options as (a) [illegible] who
threaten to reveal inadequacy; (b) unrealistic college
experts freed from the demands of real-life children; and
(c) administrators who are sure to [illegible]
weakness when completing an annual evaluation form.



The purpose of this study was to determine whether
teachers could provide suggestions that would improve the
ability of other teachers to effect pupil progress. The study
was constituted to provide for wide generalization within
the field of reading; i.e., a large number of suggestions
were made for teaching many skills of reading at different
grade levels and these suggestions were given to teachers
with varied backgrounds. The following design factors
were employed to maximize the possibility of finding
value in teachers’ suggestions: (a) suggestions were specific
to the teaching of particular reading skills which were
operationally defined; (b) each teacher who provided the
suggestions for a given teaching task was very familiar with
that task; and (c) the population of teachers within which
the effect of suggestions was to be noted consisted of
teachers who needed help because they were inferior to
their peers in teaching the particular skills.


Method 
Subjects 


The “suggestors” were 24 teachers who provided
written suggestions. They were all experienced teachers
in the Los Angeles area who were candidates for a
Master's degree in the teaching of reading. Their ages
ranged from 22 years to 56 years, and they were
characterized as highly verbal, scoring above 48 on the Miller
Analogies Test.

The teachers selected to receive or not receive
suggestions, i.e., the experimental or control teachers, were
those whose pupils did not achieve under their direction
as well as other pupils taught by other teachers. These
unsuccessful teachers had a wide range in background.
Some of them had taught for more than 14 years, others
were beginning teachers, and a few were instructional
aides.


Tasks and Materials 


Each of the 24 suggestors also provided a teaching
performance task. Each task was in the form of a
mini-lesson consisting of a measurable instructional objective,
a sample test item, information regarding the importance
of the objective as a reading skill, and background
information that a teacher could use in planning the lesson. It
was made clear that the person who would teach the task
was free to design her own lesson within the constraints
of the objective. Each task was intended for a particular
population of learners, e.g., children in kindergarten, high
school students in remedial reading. It was specified that
the teacher should have 30 minutes for preparation and 15
minutes for teaching the lesson. The objectives of the
tasks included recognizing open and closed mouth sounds;
identifying words that rhyme; matching initial sounds of
spoken words with pictures whose names begin with the
same sound; discriminating vowel sounds of different
printed words; applying the “final e” rule; arranging
pictures in sequential order according to a story;
identifying compound words; recognizing spelling patterns and
using them in decoding new words; identifying statements
of fact and opinion; differentiating specific and general
words; using guide words; distinguishing homonyms;
interpreting metaphors; identifying thesis sentences.

Prior to preparing their suggestions for teaching the
different lessons, the suggestors made task analyses for
themselves in order to identify the prerequisites, that is,
to see what was involved in the task. They also
constructed 10-item criterion referenced tests with which to
assess pupil attainment of each objective. In most
instances, these 24 teachers composed and tried out their
own lessons to ensure that the tasks were appropriate for
intended learners and to gain confidence in the
procedures which they would suggest to others.


Written Suggestions 


The suggestions were task specific. General principles,
when given, were followed by examples of how the general
term applied in the particular case. Suggestions were
written for each of the following principles: [illegible]
purpose; motivational appeal, e.g., [illegible]
personal experiences, humor; [illegible]
responses, including manipulation; [illegible]
sequencing [illegible]; use of [illegible];
appropriate [illegible]; analogous [illegible];
knowledge of results; prompting; [illegible]
both irrelevant practice and attending
to the loss of others.

By way of example, the suggestions that were given
for teaching the mini-lesson “recognizing compound words”
appear below:

1. Write several compound words on the board. Let the
children look at the words for a moment. Welcome
inquiries or deductions regarding the words.


2. Ask children what is special, common, etc., to all 
the words. 

3. If necessary, prompt the children to recognize 
that the words on the board have words within words— 
two words put together. Reinforce any comments made 
with regard to two words in one, words put together, etc. 

4. Ask children if they know what these words are 
called. 

5. Define a compound word. Ask children if they 
can think of some compound words on their own. Ask 
children to identify the words that made up the com- 
pound words supplied. 

6. Try to distract the children by mixing compound 
words with words that contain prefixes, suffixes, and 
poetic prepositions. Contrasting prefixes and suffixes to 
components of compound words is always insightful. 

7. Write a list of words on the board and have children
come up and circle the compounds and divide their parts
with a slash; or pass out a little quiz and have children
complete it at their desks. Correct the exercise together
immediately.

8. Problems: If children mix up compounds with 
base words with prefixes and suffixes, stress the fact 
that compound words are made up of more than one 
word. Contrast this with the prefix or suffix which is a 
tag-along. For example, can the two parts of this word 
stand by themselves? preview —pre/view. Can the two 
parts of this word stand by themselves? moreover— 
more/over. 


Procedure 

The suggestors went with their performance tasks and 
suggestions to schools where they each selected two 
teachers with pupils at a level appropriate for the task. 
Each pair of teachers selected was asked to undertake a 
performance task, teaching a particular reading skill to a 
group of six or eight pupils. The pupils were to be ran- 
domly chosen from those present and named on the class 
register. Each member of each pair of teachers was given 
30 minutes to design her own individual lesson, adhering 
to the same instructional objective as her peer. The sug- 
gestor observed the 15-minute lessons which ensued and 
then administered post-tests to the pupils in order to assess 
the effects of the instruction. 

Later a coin was tossed to decide which teacher in each
pair would receive suggestions before teaching another
group of children the same reading skill. This
experimental teacher was asked to redesign her original lesson
to include the suggestions; the control teacher was asked
to teach the lesson a second time to different pupils “in
whatever way you think best.” Both teachers in each of the
24 pairs were again allowed 30 minutes for their
preparation. Children for the second lessons were randomly
drawn from those present in the rooms who had not
been taught before. Lesson observations and post-testing
were conducted as in the first trial.


Analysis 
It is recalled that each group of two teachers taught to
a different objective. The high-achieving teacher in each
[illegible] to a peer. [illegible] these unsuccessful
[illegible]

A chi-square test with Yates' correction was applied
inasmuch as the categories of improved or not improved
teachers were based on different scores taken from
independent samples of subjects.


Results 


The overall results of the significance test for the effect
of teacher suggestions upon the improved performance of
unsuccessful teachers are summarized in Table 1.


Table 1.—Summary of the Significance Test for the Effect of
Teacher Suggestions upon the Performance of Unsuccessful
Teachers

                      Improvement   No Improvement
With suggestions      [illegible]   [illegible]
Without suggestions   [illegible]   [illegible]

χ² = 4.04
p < .05


All teachers with suggestions improved relative to their
first performance; slightly more than half of the teachers
without suggestions did better. It is interesting to note
that 58% of the teachers with suggestions exceeded their
high-performing peers on the second lesson, while only
25% of the teachers without suggestions were able to do
so.
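
The reported chi-square can be checked against the verbal summary of the results. The sketch below is offered as an illustration, not as the author's computation. The cell counts are assumptions, since they are not legible in Table 1: 12 teachers per group, all 12 with suggestions improving, and 7 of the 12 without suggestions improving ("slightly more than half"). The function name is hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the 2 x 2 table
    [[a, b], [c, d]] (rows: groups; columns: improved, not improved)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Subtract n/2 from |ad - bc|, never letting the result go negative.
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


# Assumed cell counts (not legible in Table 1):
# with suggestions: 12 improved, 0 not; without: 7 improved, 5 not.
chi_square = yates_chi_square(12, 0, 7, 5)
print(round(chi_square, 2))  # 4.04
```

Under these assumed counts the corrected statistic matches the reported value of 4.04, and group sizes of 12 are also consistent with the reported 58% (7 of 12) and 25% (3 of 12).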
Discussion


The results imply that the suggestions of teachers are
helpful to less successful teachers. It is difficult to assess
the effect of suggestions upon high-achieving teachers.
This is so because the high achievers already are likely to
be using many of the suggested principles. Also, the
possibility of improvement is more remote when one is
already excelling.

The conditions by which teacher suggestions were
formulated and given in this study may be regarded as
atypical. Teachers seldom have the time to analyze
carefully what is involved in the teaching of a particular
objective and to design ways to maximize its attainment.
Further, few teachers set teaching tasks for peers
and give specific directions for teaching these tasks.
However, the study should not be discounted for such
irregularity.


Staff development might be enhanced by encouraging
teachers to study systematically the instructional tasks
that they and their peers believe to be important. The
idea of teachers attempting to validate the suggestions
of peers could give rise to more alternative modes of
teaching and, thereby, serve more pupils effectively. If
improvement can come when teachers follow suggestions
for teaching involving an imposed task, a task which one
does not necessarily regard as crucial, imagine what the
results might be when suggestions are directed at those
objectives considered vital by the teachers.

There are at least two factors that keep suggestions by
peers from helping the less successful teacher. One, the
suggestions themselves [illegible] that they cannot
[illegible] teaching both more effective and a
[illegible] work.


1. Department of Health, Education and Welfare, “Right to
Read Program,” Federal Register, 39, 161: 29929-29931,
August 1974.

2. Dreeben, Robert, “The School as a Workplace,” Second
Handbook of Research on Teaching, Rand McNally & Co.,
Chicago, 1973, pp. 450-471.

3. Lewis, Arthur J.; and Miel, Alice, [illegible],
Wadsworth Publishing [illegible]


USING AN ACADEMIC PEER INTERACTION


ABSTRACT


An academic peer interaction contingency was introduced into an ongoing token economy program in a class of five emotionally
disturbed children with minimal interpersonal skills. Seven behavioral categories of academically relevant and irrelevant behaviors
were recorded during multiple baseline, contingency, and follow-up observation periods. An ANOVA revealed a significant category
effect and a significant category by time interaction, indicating significant changes in the distribution of student behavior across
categories as a function of contingency introduction. Hypothesized increases in student academic cooperation with peers and
teacher, and hypothesized decreases in student academic work alone occurred at statistically significant levels. It was concluded that
enduring academically constructive changes in interpersonal interaction within the classroom occurred as the result of contingency
introduction.


TOKEN REINFORCEMENT PROGRAMS have been
used in recent years to increase academically appropriate
behaviors in a wide variety of subject populations,
including emotionally disturbed children (6). The majority of
such programs have had as a primary goal the decrease of
disruptive classroom behavior (5, 9). In these programs
children typically earn tokens by engaging in academically
appropriate behaviors such as remaining in their seats,
raising their hands, and doing work correctly (2, 5).

Despite the fact that emotionally disturbed children
experience considerable behavioral difficulty related to
academic peer interaction in the classroom setting, few
token economy programs with this subject population have
emphasized peer academic interactive categories of
appropriate behavior. The few studies that have focused on
the peer interaction of emotionally disturbed children
within the classroom (1, 7) have generally been concerned with
individual children and their relation to a group of
non-problem children in the regular classroom. In addition,
such studies have usually employed indirect methods of
reinforcing peer interaction by increasing the
attractiveness of the target child (4).

The present study extended previous token economy
research by modifying an ongoing token economy program
with emotionally disturbed children. In addition to the
usual goal of increasing individual academically appropriate
behavior, the modified program reported in the present
study sought to increase systematically positive peer
academic interaction within the classroom. Since
academically relevant behaviors constituted the target behaviors
in the present study, it was hypothesized that students
would show a significant increase in academic cooperation
with both students and teacher. In addition, academic work
alone was expected to decrease significantly.

Although additional behavioral categories were observed
(e.g., “negative behavior,” “other behavior”) in order to
provide a complete picture of each S's behavior during the
course of the study, these dealt with non-academic
behaviors. While change in these categories was obviously
anticipated since the category system was exhaustive in
terms of each S's behavioral repertoire, it was not possible
to anticipate the specific ways in which these categories
would be affected by the manipulation. Consequently,
no specific directional hypotheses were made for the
non-academic categories.


Method 


Subjects 


The Ss were five black children of lower socioeconomic
status in a special education class for the emotionally
disturbed. Recent WISC Full Scale IQ scores ranged from
70 to 90. The group consisted of four males, ages 6, 10, 10,
and 11, and one 8-year-old female. Data from an additional
male class member, age 6, are not reported since he
attended class infrequently.

On the basis of modal behavioral characteristics, the
students were divided into two groups: Group 1 contained
one female and the youngest male, who both were
withdrawn and rejected by all class members; Group 2,
consisting of the older students, contained three males who
were outgoing and socially accepted by all class members.
All students had been class members for at least five months
prior to the present study.


Procedure 


The Ss became involved in a token economy system
upon admittance to the class. Consequently, all Ss had at
least five months' experience with an ongoing token
economy system in which points were earned for a variety of
behaviors prior to the program modification described
below. In the ongoing token economy program Ss could
earn 1 point by maintaining appropriate behavior for each
of 7 thirty-minute periods during the school day.
Additional points could be earned during any thirty-minute
period by completing an assignment, playing a game,
listening to a record, watching a filmstrip, completing an
art project, or completing other appropriate classroom
activities. The Ss usually earned a minimum of 30 points
for an average day.

Ss had the option of accumulating their points over
time or spending them at the end of the school day.
Back-up tokens such as toys, candy, games, and other objects
were available to be purchased at the Ss’ choice, at
exchange rates which varied from 3 points (permitting daily
redemption) to 500 points (requiring accumulation of
points over several days or weeks).

Since the manipulation described in the present study
was initiated as a treatment innovation in an ongoing
applied classroom setting, a within-subject multiple AB
design with follow-up observations was employed (6).
Once the effective behavioral changes described below
appeared as a result of introducing a peer academic
interaction contingency into the ongoing token economy
program, it was not feasible to return to precontingency
conditions in this particular classroom setting.

For the present study a peer academic interaction
contingency was introduced following a baseline
observation period. All Ss were told that if they worked with a
person who was not in their group they would obtain an
extra point for each activity completed with that person.
Thus, if a unit of work was completed with a child in
the other group, 2 points were earned instead of the usual
1 point. To earn the extra point a S had to be engaged in
an active working partnership with a member of the other
group, rather than merely sitting with each other.

The Ss’ behaviors were recorded during baseline,
contingency, and follow-up observation periods by an
observer in the classroom. The observer was familiar to the
children and had frequently been present in the classroom
as a nonparticipant in class activities prior to and
during the study. All observation periods were fifteen
minutes long. The dependent measure consisted of the mean
percentage of time that each S was engaged in each of
seven behavioral categories during multiple baseline,
contingency and follow-up observation periods. The
categories, which were designed to be mutually exclusive,
were as follows:


1. Academic work alone

2. Cooperative academic interaction with another
student

3. Cooperative academic interaction with teacher

4. Cooperative non-academic interaction with another
student

5. Cooperative non-academic interaction with teacher

6. Negative interaction (fighting, verbal abuse of
another, etc.)

7. Other behaviors (e.g., sitting alone while not
engaged in academic work or interacting with
others; aimlessly wandering around the room)


The first three categories were defined as having 
academic material on the desk for thirty or more seconds 
during each minute of each fifteen-minute observation 
period. The student was required to be actively involved 
with the materials (e.g., writing, turning pages) for the
categories to be scored.

The next three categories were defined as any type 
of verbal or physical exchange with others while not en- 
gaged in an academic activity. 
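
The dependent measure, percentage of time in each category, can be sketched in a few lines. This is an illustrative reading rather than the authors' scoring procedure: it assumes that each minute of a fifteen-minute observation period receives exactly one of the seven category codes, which the source implies by calling the categories mutually exclusive and exhaustive but does not state outright. The category labels and function name are hypothetical shorthand.

```python
from collections import Counter

CATEGORIES = (
    "academic work alone",
    "academic interaction with student",
    "academic interaction with teacher",
    "non-academic interaction with student",
    "non-academic interaction with teacher",
    "negative interaction",
    "other behaviors",
)


def percent_time(minute_codes):
    """Percentage of observed minutes scored in each of the seven categories."""
    if not minute_codes:
        raise ValueError("at least one scored minute is required")
    unknown = set(minute_codes) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unrecognized category codes: {sorted(unknown)}")
    counts = Counter(minute_codes)
    total = len(minute_codes)
    return {cat: 100.0 * counts[cat] / total for cat in CATEGORIES}


# Made-up fifteen-minute record for one S (illustration only).
record = (
    ["academic work alone"] * 9
    + ["academic interaction with student"] * 3
    + ["other behaviors"] * 3
)
percentages = percent_time(record)
print(percentages["academic work alone"], sum(percentages.values()))
```

Because every minute is assigned to exactly one category, the seven percentages for an observation period always sum to 100, which is why a gain in one category must be offset by losses elsewhere.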

Pre-testing indicated that these categories could be
used reliably by more than one observer and could
account for all of a S’s behavior during observation
periods.

Data were recorded for a total of 55 fifteen-minute
observation periods [illegible]. Baseline data, consisting of
17 random fifteen-minute observations, were gathered during
the first three days prior to contingency introduction.
Contingency data, consisting of 38 random fifteen-minute
observations, were gathered during the remaining twelve
days. Follow-up consisted of 5 random fifteen-minute
observations obtained more than one month later and
distributed over two days. Practical considerations related
to the school’s schedule necessitated the use of unequal
numbers of observations during each phase of the study.


Results 


The mean percentage of time spent by Ss in the seven
behavioral categories during baseline, contingency, and
follow-up is presented in Table 1.

A least-squares analysis of variance with the effects of
Ss and time absorbed (3) was performed on these data as
indicated in Table 2. The analysis revealed a significant
main effect for category (F = 8.261; df = 6, 72; p < .0005),
indicating that student behavior was unevenly distributed
across categories, as would be expected. Of greater
importance, a significant category X time interaction (F =
3.996; df = 12, 72; p < .01) suggested that the distribution
of students' behavior across categories changed significantly
as a function of baseline, contingency, and follow-up.

Specific comparisons of category X time means using 
Tukey’s procedure (8) demonstrated that academic coop- 
eration with the teacher was significantly greater from 
baseline to contingency and from contingency to follow- 
up (p <.05). In addition, academic cooperation with 
students, the behavior of major interest in the present 
study, showed a substantial increase (p < .10) from base- 
line to contingency, and no significant decline from con- 
tingency to follow-up. Academic work alone decreased 
substantially (p < .10) from baseline to follow-up. 

It can be seen from Table 1 that although changes in
other behavioral categories over time failed to meet a
rigid criterion of statistical significance, changes did occur
in the appropriate directions. For example, it is particularly
noteworthy that the three Ss who engaged most frequently
in academically irrelevant behavior (other behavior) during
baseline showed marked decreases for this behavioral
category during contingency and follow-up.


Table 2.-ANOVA Summary Table for Mean Percentage of Time
Spent by Subjects in Seven Behavioral Categories
during Baseline, Contingency, and Follow-up

SOURCE             df   SS          MS         F

Category            6    7881.962   1313.660   8.261*
Category X time    12    7625.867    635.489   3.996**
Error              72   11449.600    159.022

*p < .0005
**p < .01

1. The main effects of time and subjects have been absorbed, following
Harvey (3).
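
The entries of the ANOVA summary are internally consistent, which can be confirmed with a short calculation. The sketch below uses only the sums of squares printed in Table 2; the error degrees of freedom (72) are taken from the F tests reported in the text, and the standard-error formula, sqrt(MS error / N), is an assumption about how the standard errors accompanying Table 1 were obtained.

```python
import math

# Sums of squares and degrees of freedom as printed in Table 2.
# The error df of 72 is taken from the F tests reported in the text.
sources = {
    "category": (7881.962, 6),
    "category x time": (7625.867, 12),
    "error": (11449.600, 72),
}

# Mean square = sum of squares / degrees of freedom.
mean_squares = {name: ss / df for name, (ss, df) in sources.items()}
ms_error = mean_squares["error"]

# Each F ratio is the effect mean square over the error mean square.
f_category = mean_squares["category"] / ms_error
f_interaction = mean_squares["category x time"] / ms_error


def standard_error(n_per_mean):
    """Assumed form: standard error of a mean of n observations."""
    return math.sqrt(ms_error / n_per_mean)


print(round(f_category, 3), round(f_interaction, 3))
print(round(standard_error(5), 2), round(standard_error(35), 2))
```

The computed ratios reproduce the printed F values of 8.261 and 3.996, and the assumed formula gives 5.64 and 2.13 for means based on 5 and 35 observations, as reported. For a mean based on 15 observations it gives about 3.26, against the printed 3.25, a difference that truncation would explain.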


Table 1.—Mean Percentage of Time Spent by Subject in Category

Categories: Cooperative Academic Interaction with Student;
Cooperative Academic Interaction with Teacher; Academic Work
Alone; Cooperative Non-Academic Interaction with Student;
Cooperative Non-Academic Interaction with Teacher; Negative
Interaction; Other Behaviors

B = Baseline
C = Contingency
F = Follow-up

Standard error for a category X time mean (N = 5) = 5.64
Standard error for a category mean (N = 15) = 3.25
Standard error for a time mean (N = 35) = 2.13
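The standard errors listed with Table 1 and the F ratios in Table 2 can be checked against the error mean square. The sketch below assumes the usual relations, standard error = sqrt(MS error / N) and F = MS effect / MS error; the function names are illustrative and not part of the original study.

```python
import math

# Error mean square reported in Table 2 (ANOVA summary).
MS_ERROR = 159.022


def standard_error(ms_error, n):
    """Standard error of a mean based on n observations."""
    return math.sqrt(ms_error / n)


def f_ratio(ms_effect, ms_error):
    """F ratio for an effect tested against the error mean square."""
    return ms_effect / ms_error


if __name__ == "__main__":
    for label, n in [("category X time mean", 5),
                     ("category mean", 15),
                     ("time mean", 35)]:
        print(f"SE of a {label} (N = {n}): {standard_error(MS_ERROR, n):.3f}")
    print(f"F for category:        {f_ratio(1313.660, MS_ERROR):.3f}")
    print(f"F for category X time: {f_ratio(635.489, MS_ERROR):.3f}")
```

Under these assumptions the computed values agree with the reported ones to within rounding in the second decimal place.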
Discussion and Summary


The results of the present study suggest that peer interactions
in a class of five emotionally disturbed children can
be successfully manipulated in positive and academically
relevant directions with the addition of peer-oriented
contingencies to a more traditional ongoing token economy
program. The strength of such manipulations is attested
to by the fact that statistical significance was achieved or
approached for relevant behavior changes, a more rigid
criterion than is commonly applied to behavior modification
research. Further, it is evident that behavioral changes,
once established, persisted over time in the present study.
It should be pointed out that both the classroom and the
subject characteristics in the present study were relatively
unique: Class size was relatively small, and the class members
were probably more homogeneous with respect to
race, socioeconomic status, IQ, and other factors than
would be expected for many comparable special education
classes, but the group was more heterogeneous with respect
to age.

Results showed the efficacy of utilizing behavior
modification principles in bringing about specific changes
in the behaviors of a class of five emotionally disturbed
children. It is clear from the positive outcome of this
program that the use of behavior modification techniques
to manipulate peer interaction, in addition to other target
behaviors, is desirable. This technique appears to have
considerable potential for use with students with behavior
problems or other emotional difficulties since interaction
is often a major problem for these children.
THE EFFECTS OF FRUSTRATION ON THE FIGURAL
CREATIVE THINKING OF FIFTH GRADE STUDENTS


K. BRADLEY FROST 
University of Georgia 


ABSTRACT 


[...] were individually tested twice, once under frustrating
conditions [...] to determine whether their creative expression
would be affected by [...] of being prematurely halted [...]
opportunity of receiving a reward upon its completion. The
non-frustrating [...] rewarded. Changes in fluency, flexibility,
originality, [...] Figural Forms A and B, were investigated.
On all four [...] the frustrating conditions. Females scored
[...] interaction effect found was the sex X treatment
interaction [...].


ALTHOUGH THE VERY NATURE of our existence
precludes the possibility of a frustration-free environment,
the effects of frustration on mental functioning should
be better understood so as to minimize its harmful influences
and maximize its benefits. The scope of this exploratory
study is to study the effects of frustration on
the creative thinking of normal, healthy fifth grade children.

A considerable body of research (2) indicates that
frustration and other stresses may cause either improvements
or decrements in human performances depending
upon the intensity and duration of the frustration and the
status of the person experiencing frustration. An earlier 
study by Turner (5) involving 55 emotionally disturbed 
children ranging in age from 6 to 18 years showed that 
even moderate frustration resulted in decrements in 
creative functioning as measured by the Torrance Tests of 
Creative Thinking, both figural and verbal. Many people 
believe, however, that mild frustration facilitates creative 
thinking and may even be necessary for it to occur. 

The major purpose of the present study was to determine 
whether moderate frustration, such as that used by Turner, 
when applied to normal, healthy children would result in 
decrements of performance in figural creative thinking. 
Secondary purposes were to determine the interaction effects
of level of academic achievement and sex with
frustration on figural creative thinking. It was hypothesized
that frustration would increase the variability of the
performance of healthy fifth grade children but that it would
not cause a decrement in performance. Further, it was
hypothesized that level of academic achievement and sex
would not influence the effects of frustration on figural
creative thinking.


Procedures 

The statistical design of the study is a simple repeated
measures design with one within-subject variable
(frustration) and two between-subject variables (sex and
academic achievement level). The Ss, in essence, acted as
their own controls. Each S was tested twice, once under
experimental conditions and once under control conditions.

The Ss were drawn randomly from the fifth grade population
of a rural intermediate school in Northeast Georgia.
The Ss were mostly from middle-class families. Eight Ss,
four male and four female, were selected randomly from
each of the three achievement levels established in the
school.
Experimental Treatment

The experimental procedures were as follows:

Step 1.—The Ss were located by the researcher during
the school day and escorted by him to a spare classroom
where treatment was presented individually. During this
time the researcher explained to the S that he had been
chosen to help in an experiment for the University of
Georgia if he so desired.

Step 2.—In the classroom was a table on top of which
sat a large box filled with various types of candy bars. The
S was seated by the box and given the following instructions:

Before we begin I want you to relax. This is not
a test. I want to see how you can use your imagination. But
before we do that, I want to give you a chance to win a
candy bar. All you have to do is complete this word search
puzzle in the time I allow and you may pick your choice of
any candy bar in the box. Are you ready? Go.

At the end of two minutes, before the S could complete 
the puzzle but after he was totally involved, the researcher 
called time. At this point, the researcher closed the box of 
candy and removed it from the table. 

Step 3.—Immediately following the frustrating experience, 
the Torrance Test of Creative Thinking (TTCT) was ad- 
ministered to each S. The standard directions provided in 
the test booklets were read to the S, with further expla- 
nation if requested by the S. 

Step 4.— Approximately two weeks later the Ss were 
located again and escorted to the same classroom. 

Step 5.—Each S was presented with the box of candy 
bars and a similar word search puzzle to complete. This 
time the S was allowed to complete the puzzle and select 
his or her choice of candy. 

Step 6.— Immediately following the non-frustrating ex- 
perience, the TTCT was again administered to each S using 
the standard directions. 

The sequence in which Steps 2 and 5 were administered 
to each S was randomly determined. 


Word Search Puzzle 


Two word search puzzles were used as a vehicle to induce 
frustration in the Ss. Puzzles of this type are easily found in 
published books of crossword and other word puzzles. 
These two puzzles were constructed by one of the Ss'
teachers to insure the students' familiarity with the words.
The object of the puzzle was to locate and circle all given 
words. This type of spelling exercise was familiar to all the 
Ss since they were regularly assigned such a puzzle to com- 
plete. It is primarily because of the Ss’ knowledge of these 
puzzles and apparent ease in completing them that the 
researcher chose this instrument in hopes of eliciting true 
frustration by prematurely halting a familiar task. To de- 
termine the time limit to allow before halting the Ss short 
of finishing each puzzle, the researcher worked each puzzle 
himself and achieved a minimum completion time of four 
minutes. A time limit of two minutes, or half the time 
needed for the researcher to finish each puzzle, was arbi- 
trarily selected. This would hopefully allow enough time 
for the S to be fully involved in the task but not allow any 
S to finish. 


Torrance Tests of Creative Thinking

For this study the TTCT Figural Forms A and B were
used to measure creative expression. The researcher chose
the Figural instead of the Verbal, or both, for the following
reasons: Since the test was administered individually, the
Figural took less time than the Verbal; the subjects have an
art class each day, so figural expression was facilitated;
reading inadequacies which could have been reflected in the
Verbal (especially in the lower achievement group) were
minimized by using the Figural form.

The two alternate forms of the Figural, A and B, were
used to provide control against interaction of the tests
[...] second testing period ranged from one to three
[...] two weeks, after the first. Figural
Form A was used for the first testing situation and Figural
Form B was used for the second. In the TTCT Norms-
Technical Manual (4), Torrance [...] the
[...] are equivalent. Abundant evidence supporting
[...] reliability of the TTCT can also be
found in the TTCT Norms-Technical Manual.


Scoring 


The tests were scored by the researcher, who obtained
an interscorer reliability coefficient of better than .95
for each component with a trained scorer who is responsible
for supervising scores for the TTCT on behalf of the
Personnel Press Scoring Service.

The TTCT produces a [...] relevant responses. Flexibility
[...] categories of responses [...] fluency. Originality
[...] by each object, other than the minimum basic idea.
There is no maximum score for elaboration.

To insure equivalency of the two forms of the TTCT,
the raw scores were converted to standard or t-scores.


Results and Conclusions 


The researcher computed a univariate ANOVA table
for each of the four components to determine statistical
significance set at the .05 level.

Although there was no statistically [...] effect, the results
in Table 1 [...] to increase creative behavior [...]
originality. The differences in the standard deviations [...]
under frustrating conditions, [...] the researcher's hypothesis.

Results in Table 2 point out that the only statistically
significant effect found was the sex X treatment interaction
for elaboration. This may be interpreted as meaning
that the effects of frustration on this component depend
on whether the S is male or female. The difference between
experimental and control means, presented in Table 3,
[...] for males, whereas the difference for females
[...]. The S X T interaction means point to the idea
[...] less elaboration under frustrating circumstances
[...] that females produce more elaboration
[...]. This interpretation seems compatible
[...] encouragement of "embroidering" [...]
in females and [...]
the idea that under stress humans [...]
manner familiar to them is in accord with [...].

The results for originality produced no significant
difference; however, the S X T interaction did closely
approach the .05 significance level.

The female Ss scored higher than the males in all four
components. This is contrary to Gallagher's report (1)
that boys function better and achieve higher scores than
girls on tasks requiring non-verbal performance. The
higher female scores are also surprising in light of our
society's traditional tendency to train males generally to
produce more and better than females.

The academic achievement levels produced interesting
data, with the average group scoring highest in all components
except elaboration, where the high achievement
group scored only 2.0 points better. The low achievement
group scored the lowest in all four components.

So it seems that, contrary to the secondary hypothesis,
sex does influence the effects of frustration on figural
creative thinking in normal fifth grade children.

Although no definite conclusions can be made from
this study, the most obvious implication may be seen as
one of generating hypotheses involving the effects of
frustration on creativity. One such hypothesis may
involve the different effect that frustration seems to have
on males and females. Since the females in this study
were generally more creative, particularly under frustrating
circumstances, a look into the possibility of academic
environments containing different levels of frustration for
males and females is plausible.

Another hypothesis might deal with the possibility
that certain levels of frustration could be instrumental in
increasing creative performance. Since a result of this
study was a trend for higher creativity scores to be produced
by induced frustration, further investigation may reveal
the existence of such an optimal level of frustration
which could possibly be utilized to maintain an optimal
[...] in the classroom.

[...] a measurement of creative abilities
in addition to academic achievement level as a criterion
[...] or grouping students might prove to
[...] educational development.

In summary, then, the major implication of this study
seems to be one of stimulating hypotheses about the effects
of frustration on creativity. However, the researcher recognizes
that future investigations will be necessary in
order for any of the above hypotheses to be substantiated.
[...] recognizes the existence of certain
[...] study, specifically, those limitations
[...] Figural subtests of the TTCT since two alternate
forms were used; those caused by this research
design since the degree of frustration is not measured; those
Table 1.—Means and Standard Deviations for Fluency, Flexibility, Originality, and Elaboration
by Treatment, Sex, and Academic Achievement Level

                       Fluency        Flexibility     Originality     Elaboration
                       X      SD      X      SD       X      SD       X      SD

Treatment:
1. Frustration        42.75  10.64   44.21  10.01    60.50  19.59    57.33  11.72
2. Non-frustration    41.12   9.24   43.75   9.66    64.67  10.50    51.25  12.30
Sex:
1. Male               41.42  12.22   43.29  10.74    60.95  21.14    56.38  10.66
2. Female             42.46   7.06   44.67   8.80    64.21  19.00    58.21  13.16
Academic achievement
level:
1. High                ...    7.88   44.13   8.47    64.88  19.52    60.88  11.87
2. Average             ...    ...     ...    ...     69.81  22.45    58.81  12.00
3. Low                 ...    ...     ...    ...     53.06  14.23    52.19  10.66


Table 2.—F-Ratios for Fluency, Flexibility, Originality, and Elaboration

Subject Variables                 Fluency   Flexibility   Originality   Elaboration

Academic achievement level (A)     1.46       2.00          2.88          1.47
Sex (S)                            0.07       0.16          0.31          0.18
Treatment (T)                      1.01       0.06          0.76          0.00
A X S                              0.69       1.86          1.23          0.70
A X T                              0.22       0.38          2.80          0.15
S X T                              ...        0.06          3.87          4.56*
A X S X T                          ...        0.01          0.69          ...

Tabled F = 4.41
*p < .05


Table 3.—Interaction Means of Treatment X Sex

Sex: Male; Female

caused by the restricted population [...] a small sample;
and also those caused by the possibility [...] the Ss being
affected by previous treatments.

FOOTNOTE

1. This paper is based on an M.A. thesis written under the
direction of Dr. E. Paul Torrance, University of Georgia, 1973.
UNIQUE MULTIPLE LINEAR REGRESSION
PROBLEMS FOR EACH STUDENT


GEORGE E. COUNTS
Southeast Missouri State University 
Cape Girardeau, Missouri 


ABSTRACT

The purpose of this report is to describe a process for creating unique multiple linear regression problems for each student. Three
formulas were utilized to define intercorrelations [...]


[...] between two variables.

Problems of these two types have been used by Southeast
Missouri State University students for several years.
Other types of problems have been simulated with limited
effort and also utilized. Multiple correlation problems,
however, proved to be both intriguing and aggravating.
The objective is to provide each student with unique
data which are random samples from a set of variables
with a known multiple correlation coefficient. One approach
is to control the relationships between all of the
variables. In the process of maintaining relationships it is
critical that means and standard deviations are also controlled.
The resulting simulation problem is to define relationships
between normally distributed variables in such
a way as to provide an expected intercorrelation matrix
as a limit when the sample size approaches infinity. The
number of variables was arbitrarily limited to ten during
this initial effort.


Procedure

The process of controlling interrelationships between
variables depends upon use of appropriate formulas. Although
some readers will be more interested in the results
rather than how such results are achieved, an overview of
the process may be helpful. Three basic formulas were finally
used in the simulation process. The first formula is a
standard score regression equation for predicting z-scores
on variable I from variable I-1 through variable 1.

$$\hat{z}_I = \beta_{I,I-1}\, z_{I-1} + \beta_{I,I-2}\, z_{I-2} + \cdots + \beta_{I,1}\, z_1 \qquad (1)$$

[...] which have already been included. The index I is at most ten
and at least two.

The set of beta weights (I > 2) in each prediction equation
(nine equations for ten variables) was calculated by
calling a subroutine (MINV) provided in the Scientific
Subroutine Package (SSP) for IBM 360 Model 40 users.
The subroutine requires values from the appropriate (expected)
intercorrelation matrix and returns values used to
calculate the beta weights. This subroutine is one of a set
which was written to provide a multiple linear regression
analysis.

A second formula was used to determine the variance
of the predicted values in Formula 1. The formula for the
variance "of a composite of any number of weighted
components" (2:421)

$$\sigma_{ws}^2 = \sum w_i^2 \sigma_i^2 + 2 \sum r_{ij}\, w_i w_j\, \sigma_i \sigma_j, \qquad \text{where } i < j$$

can be simplified as long as the variables on the right side
of the regression equation have standard deviations equal
to one. In this context the standard deviations are all equal
to one because variables are in standard score form. This
revised formula is stated below:

$$\sigma_{ws}^2 = \sum w_i^2 + 2 \sum r_{ij}\, w_i w_j, \qquad \text{where } i < j \qquad (2)$$

For this application, the weights ($w_i$ and $w_j$) are beta
weights. This variance represents the variance which can
be predicted from other variables.

Finally, the amount of unique error variance ($\sigma_e^2$) for
Variable I must be calculated. As the total variance for z-
scores is one and predictable variance ($\sigma_{ws}^2$ in Formula 2)
is less than or equal to the total variance, error variance is
equal to one minus predictable variance.

$$\sigma_e^2 = 1 - \sigma_{ws}^2 \qquad (3)$$

For I = 2 the beta value is the relationship between Variables
1 and 2 ($\rho_{12}$) and error variance is equal to one minus
the square of the correlation coefficient ($\sigma_e^2 = 1 - \rho_{12}^2$). For
I > 2, beta weights and error variance are more difficult to
determine, but, as indicated previously, the MINV subroutine
provides beta weights, and Formulas 2 and 3 provide
the unique error variance.
Given these formulas, and defining $z_{e_i}$ as the ith random
sample from a unit normal distribution ($\mu_e = 0$, $\sigma_e = 1$),
the following formulas define interrelationships between
variables:

$$
\begin{aligned}
z_1 &= z_{e_1} \\
z_2 &= \rho_{12}\, z_1 + \sqrt{1 - \rho_{12}^2}\; z_{e_2} \\
z_3 &= \beta_{3,2}\, z_2 + \beta_{3,1}\, z_1 + \sqrt{1 - \sigma_{ws_3}^2}\; z_{e_3} \\
z_4 &= \beta_{4,3}\, z_3 + \beta_{4,2}\, z_2 + \beta_{4,1}\, z_1 + \sqrt{1 - \sigma_{ws_4}^2}\; z_{e_4} \\
z_5 &= \beta_{5,4}\, z_4 + \beta_{5,3}\, z_3 + \beta_{5,2}\, z_2 + \beta_{5,1}\, z_1 + \sqrt{1 - \sigma_{ws_5}^2}\; z_{e_5} \\
&\;\;\vdots \\
z_{10} &= \beta_{10,9}\, z_9 + \beta_{10,8}\, z_8 + \cdots + \beta_{10,1}\, z_1 + \sqrt{1 - \sigma_{ws_{10}}^2}\; z_{e_{10}}
\end{aligned}
$$

The formulas above cover only the first five variables
and Variable 10. The other four may be derived in the same
manner. As the notation implies, the beta weight for a variable
must be recalculated for each different equation in
which the variable appears. Also, it should be noted that the
z-score for Variable 1 is set to the first random "error."
This is appropriate because all variance in Variable 1 is
unique variance until other variables are generated. Also,
z-score notation is appropriate for each variable. Formula
2, when applied to the weighted components, simplifies to
$\sigma_{ws}^2 + (1 - \sigma_{ws}^2)$ or one.

Substantial time is required to incorporate these formulas
into computer programs, to remove errors from each
program, and to verify that output is acceptable. (1)
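The generation scheme described by Formulas 1 through 3 can be sketched in a few lines. This is only an illustration under stated assumptions: NumPy's linear solver and random generator stand in for the MINV, GAUSS, and RANDU subroutines of the original programs, and the function name `simulate_z_scores` is hypothetical.

```python
import numpy as np


def simulate_z_scores(R, n, seed=0):
    """Draw n cases of z-scores whose expected intercorrelations equal R.

    Variable 1 is pure random "error". Each later variable is a weighted
    sum of the variables already generated (Formula 1) plus unique error
    scaled by sqrt(1 - predictable variance) (Formulas 2 and 3).
    """
    R = np.asarray(R, dtype=float)
    k = R.shape[0]
    rng = np.random.default_rng(seed)      # stand-in for GAUSS/RANDU
    errors = rng.standard_normal((n, k))   # unit normal "errors"
    Z = np.empty((n, k))
    Z[:, 0] = errors[:, 0]
    for i in range(1, k):
        R_pred = R[:i, :i]                 # intercorrelations of predictors
        r_crit = R[:i, i]                  # predictor correlations with new variable
        beta = np.linalg.solve(R_pred, r_crit)   # beta weights (stand-in for MINV)
        predictable = beta @ R_pred @ beta       # Formula 2 in matrix form
        error_var = 1.0 - predictable            # Formula 3
        if error_var < 0:
            raise ValueError("impossible intercorrelation matrix")
        Z[:, i] = Z[:, :i] @ beta + np.sqrt(error_var) * errors[:, i]
    return Z


if __name__ == "__main__":
    # Correlations .4, .2, .4 among three variables, 2000 cases per run.
    R = [[1.0, 0.4, 0.2],
         [0.4, 1.0, 0.4],
         [0.2, 0.4, 1.0]]
    Z = simulate_z_scores(R, 2000, seed=1)
    print(np.round(np.corrcoef(Z, rowvar=False), 3))
```

Because the beta weights are re-solved for every equation, the quadratic form `beta @ R_pred @ beta` reproduces the sum of squared weights plus twice the weighted cross-products in Formula 2.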


Results 


In testing a computer program based upon the pre- 
ceding formulas, eleven examples were drawn from Guil- 
ford’s text (2:404) and are presented below in Table 1. 
An attempt was made to simulate sampling from popula- 
tions with these interrelationships. 


Table 1.—Correlations between Variables 1, 2, and 3 and the
Multiple Correlation Coefficient for 11 Examples

EXAMPLE    ρ12    ρ13    ρ23    R

 1         .4     .4     .0     .57
 2         .4     .4     .4     .48
 3         .4     .4     .9     .41
 4         .4     .2     .0     .45
 5         .4     .2     .4     .40
 6         .4     .2     .9     .54
 7         .4     .0     .0     .40
 8         .4     .0     .4     ...
 9         .4     .0     .9     ...
10         .4     .2    -.4     ...
11         .4    -.4    -.4     ...

In each simulation trial 2000 values were drawn for 
each variable. The intention was to limit sampling error 
by this relatively large sample size. The GAUSS and RANDU 
subroutines (from the IBM 360 SSP) were used to allow 
sampling from normal distributions. 

Table 2 contains statistical results for each example. 
The first three columns show sample means for each var- 
iable. All means are approximately zero as expected. 

The next three report observed sample standard deviations 
and are approximately equal to one. (The computer pro- 
gram also allows translation of z-scores into raw scores 
with specified means and/or standard deviations.) The 
final three columns are empirical estimates of the popula- 
tion values in Table 1. For each example the departures 
from expected values were judged to be sampling error. 

To support this opinion, each member of each pair
(sample value and corresponding parameter value) was
subjected to Fisher's Z transformation (4:186). The
transformed parameter value minus the transformed sample
value was then divided by the standard error
(1/√(N − 3) = 1/√1997 = .022). The probability (p) of securing the
largest difference (Example 5, r12 = .449, ρ12 = .400)
or a more extreme difference in either direction by chance
is .007. The second most unlikely value (Example 4)
was r[...]3 = −.366 (p = .072). For the 33 sample values,
only one was classified as more improbable than alpha levels
of .05 or .01.

Although the most extreme case (considered in isolation)
seems rather improbable, it is the only one which [...]
that the set of sample correlations is significantly different
from expected values. This process could be repeated
with an even larger sample size if necessary.

Table 2.—Means, Standard Deviations, and Pearson Correlation Coefficients for Variables 1, 2, and 3 by
Example from Simulation Runs
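The comparison of a sample correlation with its parameter value can be retraced with the standard library. The sketch below assumes a two-tailed normal test on the Fisher-transformed values, using the figures quoted for the largest difference (r12 = .449, ρ12 = .400, N = 2000); the function name is illustrative.

```python
import math


def fisher_z_test(r_sample, rho, n):
    """Return (standard error, z, two-tailed p) for a sample correlation
    compared with a hypothesized population correlation."""
    se = 1.0 / math.sqrt(n - 3)
    # Transformed parameter value minus transformed sample value.
    z = (math.atanh(rho) - math.atanh(r_sample)) / se
    # Probability of a difference this extreme in either direction.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


if __name__ == "__main__":
    se, z, p = fisher_z_test(0.449, 0.400, 2000)
    print(f"standard error = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

With these inputs the standard error comes out near .022 and the two-tailed probability falls between .007 and .008, in line with the value reported above.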

The next problem is to relate these results to the multiple
correlational setting. The following correlation formula
(2:404) could have been used for each of the eleven examples
in Table 2:

$$R_{1.23}^2 = \frac{r_{12}^2 + r_{13}^2 - 2\, r_{12}\, r_{13}\, r_{23}}{1 - r_{23}^2}$$

However, as indicated previously and as implied by this
formula, the multiple correlation coefficient is completely
dependent upon the relationships between Variables 1, 2,
and 3. If each sample value ($r_{12}$, $r_{13}$, and $r_{23}$) is approaching
the corresponding parameter value ($\rho_{12}$, $\rho_{13}$,
and $\rho_{23}$) as a limit, then the squared sample multiple
correlation coefficient ($R_{1.23}^2$) also is approaching its
limit.
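For three variables, the multiple correlation of Variable 1 with Variables 2 and 3 follows from the three pairwise correlations. The sketch below uses the standard three-variable formula, R² = (r12² + r13² − 2·r12·r13·r23) / (1 − r23²), which is assumed here to be the one the text cites; the function name is illustrative.

```python
import math


def multiple_r(r12, r13, r23):
    """Multiple correlation of variable 1 with variables 2 and 3."""
    r_squared = (r12 ** 2 + r13 ** 2 - 2.0 * r12 * r13 * r23) / (1.0 - r23 ** 2)
    return math.sqrt(r_squared)


if __name__ == "__main__":
    # Pairwise correlations taken from legible rows of Table 1.
    for r12, r13, r23 in [(0.4, 0.4, 0.0), (0.4, 0.2, 0.4), (0.4, 0.0, 0.0)]:
        print(f"r12={r12}, r13={r13}, r23={r23}: R = {multiple_r(r12, r13, r23):.3f}")
```

Applied to the legible rows of Table 1, this formula reproduces the tabled multiple correlations to two decimal places, which is what makes it a reasonable candidate for the cited formula.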
An additional test was made to determine the
computer program's response to an intercorrelation matrix
which is impossible. For each member of the set (ρ12, ρ13, ρ23)
a value of −1.00 was assigned. The matrix was
identified as being singular by the MINV subroutine, and no
correlations were attempted.
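Three variables cannot all correlate −1.00 with one another, and such a request can be screened before any data are generated. The sketch below is an assumption-laden alternative to the MINV singularity flag: it inspects the eigenvalues of the requested matrix with NumPy, and the function name and tolerance are hypothetical.

```python
import numpy as np


def check_intercorrelation_matrix(R, tol=1e-10):
    """Classify a requested intercorrelation matrix.

    Returns "impossible" if any eigenvalue is negative (no set of real
    variables can have these correlations), "singular" if the smallest
    eigenvalue is zero, and "ok" otherwise.
    """
    R = np.asarray(R, dtype=float)
    smallest = np.linalg.eigvalsh(R).min()
    if smallest < -tol:
        return "impossible"
    if smallest <= tol:
        return "singular"
    return "ok"


if __name__ == "__main__":
    all_minus_one = [[1.0, -1.0, -1.0],
                     [-1.0, 1.0, -1.0],
                     [-1.0, -1.0, 1.0]]
    print(check_intercorrelation_matrix(all_minus_one))
    # The predictor block for Variables 1 and 2 alone is exactly singular.
    print(check_intercorrelation_matrix([[1.0, -1.0], [-1.0, 1.0]]))
```

The two-variable block is what an inversion routine meets first when solving for the beta weights of Variable 3, which is consistent with the singular-matrix report described above.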


Conclusions 


The following conclusions seem to be valid:

1. The flexibility of the program is substantial; no
difficulty is expected in the generation of statistics problems
for students.

2. The experience of computer simulation of such variables
may provide additional appreciation for the complexity
and simplicity of the statistical method. The development
of a computer program using these formulas is not
too difficult. Advanced students might prefer to start from
the problem definition stage.

3. In some situations it may not be necessary to know
the true or expected relationships. If interval-scaled scores
[...] normal distribution are converted to scales of less
precision (and theory does not provide exact relationships)
a comparable "sample" with an even larger "sample" size
can be used as a population. This population may or may
not include the student's sample.

4. With limited effort, the process could be modified
to provide sampling models which are alternatives to
sampling from unrelated variables, those for which all beta
weights are zero. Given an hypothesized intercorrelation
matrix, empirical estimation of the probability of securing
a multiple correlation coefficient less than (or greater
than) the observed coefficient is possible. Rejection of the
null hypothesis at an improbable value plus failure to
reject hypothesized values at a highly probable value
(using simulation methods) are closer to the real [...].
Simulation trials, however, may be impractical with large
samples.

5. In addition to the potential use in teaching statistics
courses which include multiple linear regression [...],
this process could also be used to generate "data" for
analysis by a student in a research methods course. [...]
Some courses of this type provide little or no [...]
for the student to demonstrate or improve his competence
in analyzing such data and reporting his findings
and conclusions. Dissertation committee members might
be reassured by the knowledge that a particular research
plan, based upon multiple linear regression, was carried
[...] logical conclusion, that the results were well
organized and accurate, that conclusions drawn were
reasonable, and that the nature of the "population" has
been shared with the investigator.


FOOTNOTE

1. Program listings for each program used in this process may
be obtained from the author, Dr. George E. Counts, Southeast
Missouri State University, Cape Girardeau, Missouri, 63701.


REFERENCES

1. Counts, G. E., "The Creation and Solution of Unique Statistical
Problems for Each Student," Journal of Experimental Education,
37: 17-20, Spring 1969.

2. Guilford, J. P., Fundamental Statistics for Psychology and
Education, McGraw-Hill, New York, 1965.

3. Kerlinger, F. N.; and Pedhazur, E. J., Multiple Regression in
Behavioral Research, Holt, Rinehart and Winston, New York,
1973.

4. Klugh, H. E., Statistics: The Essentials for Research, John
Wiley, New York, 1974.
ABSTRACT


A prospective study of 180 children from birth to age seven is reported. The criterion at age seven was knowledge of occupations as
indicated by a pictorial test. The predictors were: seven aspects of the child and home at birth, maternal IQ as tested when child was
three years, a quantified description of the potential stimulating characteristics of the home at four years; and a measure of parental
attitudes to schooling when child was age five. The data were subjected to a multivariate regression analysis. Social class data were the
prime source of criterion variance.


THE PROBLEM STUDIED in this investigation is a
description of the contribution of home and family variables
to the degree and type of knowledge of occupations
held by children in first grade. The importance of such
information to people working on curricular aspects of
vocational choice in the elementary school is considerable.
Curricula to increase the quality of career choices by young
persons need to exert an influence long before the years
of adolescence. Presumably, [...] career choice should begin,
like other aspects of development schools wish to encourage,
in the elementary school.

The preceding remarks are cast in the context of traditional
views of career development and acquisition of occupational
information. From a body of research and
writing (1, 2, 19, 21), it is clear that school counselors'
effectiveness in vocational counseling is greatly influenced
by antecedent concepts of work. These concepts are formed
well before the time for vocational selection emerges.
Although information about formation of work concepts
is incomplete (1, 13), the work of Gribbons & Lohnes (11)
indicates that children in the eighth grade have attained
vocational self-concepts which show a good deal of stability
in subsequent years. Recent work on the problem by
Wehrly (22) indicates that early formation of work concepts
is not related to parental occupation, as common
sense might indicate. The problem is more subtle, and the
influences are not clear. From this it can be concluded
that there is a need to explore the complex of early influences
which determine knowledge of occupations.
However, there is a rather different intellectual frame- 
work within which the matter of origin of work concepts 
and occupational choice has appeared in the last few 
years. The problem has been most boldly presented by 
Jencks in his 1972 book, Inequality: A Reassessment of the 
Effect of Family and Schooling in America (12). Jencks 
observes that the social problems of inequality of attain- 
ment and status in our society are tied to schooling—an 
observation this writer has documented recently (13), 
and which has been examined in great detail by Duncan 
et al. (7). Jencks goes on to set forth a conclusion based 
on his analysis of the 1966 Coleman Report data: that 
simply raising the level of funding for schools, the concept
of general support implicit in most states' Foundation
program of assistance, will not do. We are thus left with
the need to emphasize preparation for vocational choice
on a rational, curricular basis. This is, of course, a view
traditionally held by vocational experts, but it is one now
articulated at the highest levels of educational strategy.
To this discourse is added another point made by Jencks
(12: 256):

Our research suggests, however, that the character of a
school's output depends largely on a single input, namely
the characteristics of the entering children. Everything
else—the school budget, its policies, the characteristics
of the teachers—is either secondary or completely
irrelevant.

It is this central matter of ". . . characteristics of the
entering children. . ." and their explicit knowledge of
[...] arrangement of work occupations that this inquiry has
pursued. The inquiry accordingly addresses the twin
problems of describing occupational knowledge and
drawing inferences for the development of curricular
materials to develop career information.

Method 


General Design 
The general design is a prospective longitudinal inquiry
with a multivariate analysis of data. Such an inquiry has
been under way on a population of 1,000 newborns delivered
in five St. Louis hospitals in the winter of 1966-67.
The 1966-67 cohort has been studied in two portions:
the winter group selected for this study refers to children
studied on or very close to the seventh anniversary of
their births, as opposed to a summer group of children
studied annually, but six months after their birthdays.
The children in question were traced through an annual
process of confirming [illegible]. [Arrangements] were made
to test the children individually [illegible] using
trained and experienced examiners [illegible] race. The
number of children traced and tested was 284. 




Variables 


The criterion variable administered at age seven years 
was Fulton's Test of Career Knowledge (9). This is a 30- 
item picture recognition test composed of items from 
the categories of the Dictionary of Occupational Titles
(DOT) (6). Specifically, the categories are: DOT # 1—pro- 
fessional, technical, and management occupations (9 
items); DOT # 2—clerical and sales occupations; and 
DOT # 8—structural work occupations (5 items). A 
fourth category, miscellaneous occupations (5 items), is 
not reported here due to its heterogeneity. The test has 
an internal consistency (reliability) of .86, and validity, 
according to Fulton et al. (10), is demonstrated by con- 
formity to three aspects of curricular validity, plus 
conformity to elements of the DOT. 

The predictor variables were selected from a set of
measures previously gathered during annual testing in
an attempt to build a picture of family characteristics at
specific time points in the preschool years. They were
chosen to shed light on the influences determining career
knowledge in first grade with a view to drawing inferences
for curriculum development. The predictor and criterion
variables are given in Table 1, and the following elements
are those which are not self-explanatory:


Intelligence score is the raw score attained by the
study child's mother on the Ammons' (1) Quick Test
(QT), a valid and reliable vocabulary-type instrument.
This test was chosen because it draws on verbal skills
relevant to the overall purposes of the prospective study,
and because it could be administered under adverse home
circumstances more easily than most tests of ability.

Maternal education level is a score from 1 to 5
which is assigned according to years of schooling. It
ranges from a low score of 1 for elementary education
only to a score of 5 for college education.

Paternal education level is a score from 1 to 7 which
is assigned according to level of schooling. It ranges from
a low score of 1 for elementary education to 7 for a
graduate degree.

PATE score is the score on Medinnus' Parent Attitude
to Education scale (18).

STIM score refers to a quantified description of the
potential stimulating characteristics of the home, as
developed by Caldwell (5).

SES score is a weighted three-factor description of
the socioeconomic level of the home based on the
breadwinner's level of schooling, occupational title, and level of
income, as developed by McGuire and White (17).

Statistical Design 

The analytic technique applied to the data was intended
to (1) identify salient variables by means of the specific
contribution to criterion variance, and (2) examine
interactions of the variables identified as influential. The
multivariate technique employed, the AID-4 interaction
regression approach, chooses elements of a predictor series which
meet predetermined criteria for contributing to criterion
variance. Through a series of heuristic splits of predictors,
a set of consecutive splits is contrived and extended. The
splits or branching into subordinate but significant pre- 
dictors creates a tree-like array of predictors in which both 
primacy of contribution to criterion variance and inter- 
actions may be identified (16, 20). 
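
The splitting step described above can be illustrated with a minimal sketch. It assumes the classic AID criterion (for each predictor, try every binary cut and keep the one that maximizes the between-group sum of squares of the criterion). The text does not spell out the predetermined criteria or stopping rules of AID-4, so none are modeled here; the function names and the toy records are hypothetical and are not study data.

```python
from statistics import mean


def between_ss(left, right):
    """Between-group sum of squares for a two-way split of criterion scores."""
    grand = mean(left + right)
    return (len(left) * (mean(left) - grand) ** 2
            + len(right) * (mean(right) - grand) ** 2)


def best_split(rows, predictors, criterion):
    """Return (predictor, cut, gain) for the single best binary split.

    Assumption: predictors are ordered, so a split is "value <= cut"
    versus "value > cut". Returns (None, None, 0.0) if nothing splits.
    """
    best = (None, None, 0.0)
    for name in predictors:
        levels = sorted({row[name] for row in rows})
        for cut in levels[:-1]:  # the top level cannot be a cut point
            left = [row[criterion] for row in rows if row[name] <= cut]
            right = [row[criterion] for row in rows if row[name] > cut]
            gain = between_ss(left, right)
            if gain > best[2]:
                best = (name, cut, gain)
    return best


# Hypothetical toy records (not study data): lower SES code = higher status.
toy = [
    {"ses": 1, "sex": 0, "score": 26},
    {"ses": 2, "sex": 1, "score": 25},
    {"ses": 3, "sex": 0, "score": 21},
    {"ses": 4, "sex": 1, "score": 20},
]
print(best_split(toy, ["ses", "sex"], "score"))
```

Applying the same search again to each of the two resulting subgroups, and stopping when no split adds enough explained variance, would yield a tree of numbered groups of the kind described in the Results.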


Results 
Descriptive Findings 

The data analyzed in this report consist of information 
from the developmental history of each study child from 
birth to age seven years. Complete information on all ten 
predictor variables plus criterion variables is reported for 
180 Ss in Table 1. There it will be seen that the Fulton 
test criterion score has been additionally treated to pro- 
vide three additional criterion scores: DOT subgroup 
scores for technical, clerical, and structural occupations, 
bringing the total number of criteria reported and analyzed 
to four. 

It can be seen from Table 1 that the Ss are approx- 
imately balanced by sex (47% male). The racial com- 
position of the study group is 20% black, which approx- 
imates the national proportion. The mean social class 
(SES) score of 51.59 in Table 1 is quite close to the mean 
of 59 (σ = 16) observed in 1966-67 for the birth cohort
(15). The difference of 2 points is in the direction of 


a slightly higher social level. This insignificant trend is 
explained by the problem of tracing and testing children 
from the lowest social strata. This particular problem also 
explains why the inferential study group described here is 
less than the 280 cases tested. The children omitted are 
those for whom there was incomplete information on any 
of the ten predictor variables gathered during the pre- 
ceding seven-year period. The mean level of schooling 
score attained by mothers is 2.93, which is a little less 

than four years of high school, on the average. The intel- 
ligence score mean reported for mothers in Table 1 trans- 
lates into an IQ of 92. Paternal education level is only 
slightly higher than maternal; the mean of 4.19 in Table
1 indicates an average level of schooling just a little beyond
high school for fathers. The birth order mean value is 2.88, 
indicating that the study children tended to be the third- 
born, having two older brothers and sisters. This fact is 
consistent with the information in Table 1 on the mean 
age at delivery of the mothers. The average mother was 26 
years and four months at the time the child under study 
was delivered and enrolled in the longitudinal study. Again, 
on the average, she took the PATE scale at age 31 years 
(child age 5 years), and had a mean score of 59.02. This 
score is also similar to that observed for several hundred 
mothers on the average. 


Regression Models 


The basic technique of inferential analysis, the AID-4
method, employs mathematical regression of variables on a
criterion. It is helpful to observe the materials listed in
Table 2. There, the final regression models are given as
described by the interaction/regression program. It may be
seen that the R² values are mostly from .18 to .24, with
the exception of the second criterion, for which the R²
value is .13. While these values are not high, they are quite
typical of values observed in other analyses at school
entry age (15). In three of these cases the complex models
listed are statistically significant at the .01 level of
confidence, while the others are significant at the .05 and .03
levels. The five models generated are, accordingly, adequate
for purposes of further analysis. A more detailed
examination of the interaction patterns of "trees" generated
through these regression models is now presented.


Table 1. Description of Ss Used in Multivariate Analyses (N = 180)

CHILD AGE   PREDICTOR/CRITERION                               MEAN     SIGMA

Delivery    Three-factor SES score [McGuire & White (17)]     51.59    15.77
Delivery    Race (%W)                                         80
Delivery    Paternal education                                 4.19     1.50
Delivery    Maternal education                                 2.93      .98
Delivery    Birth order
Delivery    Maternal age                                      26.31     6.35
Delivery    Sex (%M)                                          47
3 years     Maternal QT raw score [Ammons & Ammons (1)]       38.94     5.12
4 years     STIM score [Caldwell (5)]                         34.66     4.53
5 years     PATE score [Medinnus (18)]                        59.02     7.01
7 years     Fulton test total score                           22.68     3.03
7 years     Technical occupations score (DOT #1)               3.48      .98
7 years     Clerical occupations score (DOT #2)                4.25      .86
7 years     Construction occupations score (DOT #8)           3312;      .91


[Table 2 row labels; values not recoverable. CRITERION: Total score;
DOT technical group #1; DOT clerical & sales group #2;
DOT construction group #8]


In Figures 1 - 4 the results of applying the interaction
regression technique to scores obtained by children at age
78 months on Fulton's Test of Occupational Knowledge
are seen. The four figures represent formulation of the
scores. First, there is a total score, followed by Figures
2 - 4 representing subscores for three occupational
subgroups (professional and technical, clerical and sales,
and structural work) corresponding, respectively, to DOT
categories # 1, 2, and 8.

In Figure 1, the full score on the Fulton test, five of
the nine elements in the predictor set have been retained
as significant factors. They are, in order, the three-factor
social class score; race; birth order; maternal raw score on
the Ammons' Quick Test of verbal intelligence; and
paternal level of schooling. The last elements in the tree,
groups 8 - 9, 12 - 13, and 14 - 15, were composed from
data on maternal QT scores, paternal level of schooling,
and birth order. These last three splits raised the proportion
of assigned criterion variance, but not at a statistically
significant level. Of the total variance accounted for
by the full model, R² = .24, nearly two-thirds were
explained by the first two factors, SES and SES-race. The
series of splits which generated the model schematized in
Figure 1 was generated through Group 3, whose levels
represent all but the high levels of socioeconomic
status.

Figure 2 presents essentially the opposite process of
elaboration through low levels of the prime variable. In
this case the predictor is maternal age at the time of
delivery of the child. The upper levels of delivery age in
Group 3 were slightly extended late in the splitting process
by Groups 12 and 13. In contrast, the Group 2
representing the younger mothers created the remaining ten
cells of the tree. In this, the more extensive branch of the
AID-4 tree, there is repetition of the role of mothers'
PATE scores in Groups 8 - 9, and, later in the sequence of
decreasing contribution to the regression model, in Groups
14 - 15. This last split raised the proportion of assigned
variance to R² = .13. There is an interesting aspect to the
predictor set schematized in Figure 2; it is that the first
three predictors are all maternal traits: age at delivery of
the study child, attitude to education, and attained level
of education. Actually, all but one of the seven predictors
is a maternal trait. The sole exception is the relatively
uninfluential element of paternal level of education.


Table 2. Regression Models of Fulton Test Criteria and Their Levels of Statistical Significance
[table body not recoverable]


In Figure 3, the AID-4 diagram for the third criterion,
the clerical and sales subgroup score, shows a tree which
is asymmetric. Group 2 containing 40 children represents
low scores (high levels) of social class scores. This branch
was relatively unproductive, and was completed quite late
in the process of raising R² values by maternal educational
level in Groups 14 - 15. A majority of the children were in
Group 3 and had lower levels of social class background;
the AID-4 diagram was largely developed through them.
The first split from Group 3 was by maternal QT score,
and is seen in Groups 4 - 5. The lower branch ended in
Groups 8 - 9, based on fathers' educational attainment. The
lowest group of scores is in Group 8. The mean of the
criterion score of Group 8 was 2.45, which is approximately
1.25 standard deviations below the mean given in Group 1
for N = 180 children. The more elaborated development of
the AID-4 tree begins with the score for QT in Group 5.
It splits into Groups 6 - 7 based on the STIM score, followed
via Group 6 by the PATE score which expresses maternal
attitude to education. The final split of this branch ends
with Groups 12 - 13, which are based on maternal QT
scores.

Figure 4 shows the selection, priority, and nature of
the variables associated with the Fulton test score for
the construction subgroup, occupational category #8
of the DOT. Quite the most complex tree, it represents
a regression model which achieved an R² value of .23,
through nine splits into 21 cells. The prime source of
attributed variance is the three-factor social class score.
The branch containing low levels of the SES score (high
SES) is completed by low-numbered Groups 18 - 19
based on STIM score. The low side, Group 3, is the basis
for the elaboration of the tree into 21 groups. Maternal
age at delivery, Groups 4 - 5, is the variable through which
the process of splitting began. On the high side of this
variable, Group 4, subsequent elaboration through birth
order, and delivery age, and, reciprocally, the STIM and
PATE variables in Groups 16 - 17, and 20 - 21, can be
seen. On the low side of delivery age, Group 5, elaboration
through the variables STIM, maternal educational level,
and sex is seen. This last finding is interesting because
Groups 14 - 15 are the only instance of a boy/girl
contribution in all five AID interaction regression analyses.


[Figure 4. AID-4 Tree for Structural Work Occupational Subgroup.
Tree diagram not reproduced; legible root-node residue: Group 1,
criterion occupational knowledge, split on SES, M = 22.68, σ = 3.04.]


Discussion 


Findings 

The major descriptive finding of the study is that there
are interesting differences in the level of occupational
knowledge in the four groups examined: Based on the full
score on the Fulton test, the average child's knowledge
of all occupations presented by the criterion test is 80%. For
the first occupational subgroup (DOT # 1), the mean number
of occupations known is 34%. For the second occupational
subgroup (DOT # 2), the mean score is 60%. For the
third occupational subgroup (DOT #8), the mean score is
86%. When reviewed by occupational categories the
highest level of knowledge possessed by the children was
for construction occupations (86%), followed by clerical
occupations (60%). Least knowledge is demonstrated
for the professional and technical jobs (34%).

[Figure residue, AID-4 tree nodes. Group 2: SES, Levels 1 - 4,
M = 24.53, N = 15, R² = .17, p = .004. Group 3: SES, Levels 5 - 8,
M = 21.81, N = 32, R² = .17, p = .004. Group 4: Mom's QT,
Levels 4 - 7, M = 22.67, N = 18, R² = .24, p = .05. Group 5:
Mom's QT, Levels 8 - 11, M = 20.71, N = 14, R² = .24, p = .05.]

Turning to the AID-4 analyses, it is helpful to note the
extent to which the groups' means in the trees represent
a spreading-out of criterion scores. Table 3 gives the grand
means and the highest and lowest group means from the
four AID-4 analyses, wherein ± one SD values for the total
scores on Fulton test are 26.89 and 21.33. The highest
group projected by the AID-4 analysis in Figure 1 is within
the boundary, but the low group is below the minus one
sigma value. In the case of Figure 2, the high and low
boundaries are 3.92 and 2.28. The high and low cells of
the tree in Figure 2 approach but do not exceed the ± one
sigma range. In the case of Figure 3, the boundary values
are 4.67 and 2.67. The high and low groups have mean
scores at or beyond these boundaries. For Figure 4 the
boundaries are 5.11 and 3.53, and the high and low group
means approach but do not exceed these values. In general,
the spread of criterion scores from highest to lowest
groups is almost from one SD above the mean to one SD
below it. The AID-4 trees are satisfactorily spread out on
either side of the grand mean scores.


Table 3. Grand Means and High Group and Low Group Means

                                HIGH     GROUP     LOW      GROUP
ANALYSIS    MEAN     SIGMA      GROUP    MEAN      GROUP    MEAN

Figure 1             2.78        2       25.88     15       21.25
Figure 2    3.10      .82       11        3.77     12        2.46
Figure 3    3.67     1.00       14        4.67     18        2.45
Figure 4


Table 4. Primary, Secondary, and Tertiary Predictors in AID-4 Regression Models

                                       TOTAL   SUBSCORE  SUBSCORE  SUBSCORE  (TIMES
PREDICTOR                              SCORE   ONE       TWO       THREE     USED)

Three-factor social class score
  [McGuire & White (17)]               1*                1         1         (3)
Race (%W)                              2                                     (1)
Paternal education                                                           (0)
Maternal education                             3                             (1)
Birth order                            3                                     (1)
Maternal age                                   1                   2         (2)
Sex (%M)                                                                     (0)
Maternal IQ [Ammons & Ammons (1)]                        2                   (1)
STIM score [Caldwell (5)]                                3         3         (2)
PATE score [Medinnus (18)]                     2                             (1)

*1 = ordinal position in AID-4 tree


Turning now to a consideration of the variables in the
children's backgrounds which account for criterion variance
of the four Fulton test scores, Table 4 lists the first three
predictors in order of magnitude for four criterion scores;
e. g., race is found once, in the tree for total score, as the
second most influential predictor. Two of the predictors
were not used in the first three splits in creation of the
four AID-4 trees: the sex of the child and the level of
education achieved by the child's father. Two variables
used only once were maternal traits: educational level and
attitudes to education (PATE). In contrast, one variable
was used three times in the projection of AID-4 trees, and
in each instance it was used as the first split in the tree,
the prime source of criterion variance. This variable is the
three-factor social class score (SES), and it was the prime
variable in Figures 3 and 4 and in the tree for the total
score on the criterion Fulton scale in Figure 1.
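
The ± one sigma boundaries quoted for Table 3 are simple arithmetic on the grand mean and sigma, and a short sketch makes the comparison explicit. The function names are hypothetical; the numbers are the Figure 2 and Figure 3 rows as printed in Table 3. The label "at or beyond" follows the wording used in the text for Figure 3.

```python
def one_sigma_bounds(grand_mean, sigma):
    """Return (upper, lower) boundaries one SD either side of the mean."""
    return round(grand_mean + sigma, 2), round(grand_mean - sigma, 2)


def position(group_mean, upper, lower):
    """Classify a group mean against the one-sigma boundaries."""
    if group_mean >= upper or group_mean <= lower:
        return "at or beyond"
    return "within"


# Values as printed in Table 3:
# (grand mean, sigma, high group mean, low group mean)
table3 = {
    "Figure 2": (3.10, 0.82, 3.77, 2.46),
    "Figure 3": (3.67, 1.00, 4.67, 2.45),
}

for label, (m, s, high, low) in table3.items():
    upper, lower = one_sigma_bounds(m, s)
    print(label, upper, lower,
          position(high, upper, lower), position(low, upper, lower))
```

Run in reverse, the same arithmetic suggests what the boundaries quoted in the text imply for the rows that are incomplete in Table 3: 26.89 and 21.33 correspond to a grand mean of 24.11 with sigma 2.78 for Figure 1, and 5.11 and 3.53 to a mean of 4.32 with sigma .79 for Figure 4. These are inferences from the quoted boundaries, not values printed in the table.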

On the basis of the finding that SES has such a power- 
ful effect—as opposed, for instance, to the lack of an effect 
due to sex—it is helpful to look at Table 5. It shows pre- 
dictor and criterion scores displayed by quartiles of the 
SES score; e. g., four mean Fulton test criterion scores 
are presented for four levels of the perinatal social class 
score. Inspection of both predictor and criterion scores in 
Table 5 shows the trend to higher scores with rising SES 
level. 
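
The percentages reported under Findings (80%, 34%, 86%) appear to be mean scores divided by the number of items in each category. The sketch below checks that reading under two assumptions that the text does not state outright: that the means used are those in the "ALL Ss" column of Table 5, and that the item counts are those given in the Variables section (30 items in total, 9 for DOT #1, 5 for DOT #8). The item count for DOT #2 is not given in the text, so that category is left out. The function name is hypothetical.

```python
def percent_known(mean_score, n_items):
    """Mean number of items answered correctly, as a whole-number percent."""
    return round(100 * mean_score / n_items)


# (mean from the "ALL Ss" column of Table 5, item count from Variables)
scores = {
    "Total score (30 items)": (24.03, 30),
    "DOT #1 professional/technical (9 items)": (3.09, 9),
    "DOT #8 structural work (5 items)": (4.30, 5),
}

for label, (mean_score, n_items) in scores.items():
    print(label, percent_known(mean_score, n_items))
```

Under these assumptions the three computed values agree with the 80%, 34%, and 86% quoted in the Findings.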


Implications 


In considering the implications of the preceding materials
for construction of elementary school curriculum materials
on occupations, it is first observed that there is an uneven
degree of knowledge among seven-year-olds concerning
the four occupational categories tapped by the Fulton
Test of Career Knowledge. Seven-year-olds seem to be most
informed about the construction [occupations and least
informed about the] professional and technical
jobs. This suggests that small children begin school with a
quite restricted sense of work opportunities, and points to
a hiatus in their thinking which curricular materials could
seek to remedy over a period of years, especially in relation
to technical and professional challenges.


Table 5. Mean Predictor and Criterion Scores Arranged by Social Class Score Quartiles

                                              FIRST      SECOND     THIRD      FOURTH
                                   ALL Ss     QUARTILE   QUARTILE   QUARTILE   QUARTILE
PREDICTOR/CRITERION                (N = 180)  (N = 37)   (N = 52)

Sex (%M)                           47         32         30         53         51
Race (%B)                          20         48         30          4          0
SES                                51.59      70.75      60.26      49.53      29.
Delivery age                       26.31      26.18      24.25      26.23      28.67
QT (IQ)                            92         84         90         92         100
Birth order                         2.88       3.62       2.61       2.58       2.87
Mother's education                  2.93       2.13       2.71       2.83       3.85
STIM                               34.66      31.43      34.36      35.41      36.75
PATE                               59.02      62.13      60.30      55.83      58.12
Father's education                  4.19       2.70       3.55       4.23       5.95
Fulton test total score            24.03      22.29      23.73      24.11      25.59
Technical occupations score         3.09       3.05       3.09       3.00       3.20
  (DOT #1)
Clerical occupations score          3.66       3.27       3.42       3.74       4.14
  (DOT #2)
Construction occupations score      4.30       4.10       4.30       4.13
  (DOT #8)


Within this imbalance across occupational groups there
is the consistent influence of social class. Children from the
higher social classes are more informed about all four
occupational groups than children who are less favored.
Interestingly, this bias does not raise knowledge of technical
jobs disproportionately for the highest social class.
Rather, the bias is for greater knowledge of all
[jobs], including those at variance with the child's
personal [background].

From this inquiry we may identify some
leads for strategy in developing curricular materials,
although the low R² values of the regression models
provide no [illegible] innovation at this stage.
First, the [illegible] of knowledge of construction jobs
may reflect [illegible] nature of such jobs. This suggests
that [illegible] materials may be of prime
value in broadening the range of jobs known to children.
value in broadening 


Second, the absence of a sex effect in the set of major
sources of criterion variance eliminates an initial sex
discrimination factor in children's knowledge of jobs.
Accordingly, it seems that boys and girls begin elementary
school with essentially sex-less orientations to job
knowledge. In view of the contrary pattern which emerges with
the socialization process, it is helpful to know that the
sex bias found in adolescent- and adult-women's job
information is not innate or unavoidable.

Third, the absence of a race effect among the prime
sources of criterion variance is much like the sex factor.
That is, the limited occupational information of black,
inner-city youth can be viewed as a developmental bias
not evident at age seven years. Accordingly, the problems
of job choice and job accessibility in adolescence may be
viewed as developmental in both black youth and in their
peers in the white community.


FOOTNOTE


1. This study was supported by grants from CEMREL, Inc., the
National Institute of Education, and the state of Missouri.


ACHIEVING HOME-SCHOOL CONTINUITY IN THE
SOCIALIZATION OF AN ACADEMIC MOTIVE¹


ROSEMARY SWANSON 
RONALD W. HENDERSON 
University of Arizona 


free-choice situation. Analysis revealed a significant difference i . . .
. . . group children. Implications for cooperative home-school efforts
are discussed.


DURING THE PAST DECADE legislation has spawned
a variety of programs intended to deal with the fact that
the schools have been less effective in educating children
from minority cultural backgrounds than in meeting the
needs of the middle-class children whose parents generally
design and operate the educational programs in this country.
There is little agreement among psychologists and educators
even about the appropriate questions to be asked and
assumptions to be employed in seeking to remedy the
differential effectiveness of educational programs for various
sub-cultural groups.

Two points of view have dominated the design of
innovative programs intended to remedy inequality in
educational effectiveness. The first and most influential view
has been that culturally different children fail to profit as
much from school instruction as their middle-class peers
because of intellectual deficits, which are attributed to
inadequate intellectual stimulation in the home environment
(5, 6, 7). A less widely held but vocally stated point of view
asserts that the cause of culturally different children's
failure to profit from instruction is that the programs and
procedures of the schools are ethnocentric, and therefore
inappropriate and misguided when applied to children from
non-middle-class backgrounds. Proponents of this view claim
that the "inadequate environment" argument serves as the
basis for institutional racism, as manifested in compensatory
education.

[The present study] was designed to examine the viability
of [a third] and alternative set of assumptions. This third
[point of view] hypothesizes that discontinuities between the
valued objectives and child training practices of the home
and school may inhibit the effectiveness of school programs.
This point of view seems to offer greater potential for
generating productive solutions than the competing sets of
assumptions because it suggests that an initial step toward a
solution of the problem of unequal educational effectiveness
would be to seek objectives which are mutually
desirable to school personnel and parents, and to design
mutually acceptable strategies to influence children's
progress toward these objectives both at school and at home.

In this context, continuity is conceptualized as congruence
between the objectives for children's learning that are
mutually valued by the home and school, and compatibility
between the educational practices of the school and the
socialization practices in the home which are articulated to
those objectives.


Theoretical Perspective 


For reasons to be discussed later, the independent variable
selected for study in the present investigation was an
academic motive: interest in reading materials. The intervention
was designed to bring about changes in children's preference
of reading materials when these materials were in
competition with attractive alternative choices.

The assumptions about motivation which are most widely
accepted in education have been derived, usually informally,
from theories of personality, which generally
conceptualize motivation as an enduring state or energy system
which governs the organism's activities (1). Such theories
fail to provide guidance for action programs because they
leave it unclear how one would go about influencing
individuals to adopt new motives. A central question to be
addressed in any effort to alter motives is "How do some
people come to feel pride in academic accomplishments,
while others feel no sense of satisfaction from such deeds?"
One view (13) suggests that when an event that has already
acquired reinforcement value, such as praise, approval, or 
material incentives, is paired with a class of behaviors such 
as academic accomplishments, the behaviors associated 
with striving for academic success may acquire secondary 
reinforcement value. Since direct reinforcement for academic 
behaviors is delivered on an intermittent basis in the natural 
environment, such activities are likely to be maintained for 
quite some time in the absence of external reinforcement, 
and the observer who witnesses this behavior may well con- 
clude that the learner is intrinsically motivated. 

Support for this interpretation is provided by
Winterbottom's descriptive study of achievement motivation in
eight-year-old boys (15). Winterbottom found that mothers of the
boys with high motivation made earlier demands on their 
sons for independence and excellence than did mothers of 
boys with low achievement motivation. Moreover, when 
the boys with high achievement motivation met maternal ex- 
pectations, their mothers reported using physical affection 
more frequently than did mothers of the less motivated 
boys. 

Given these well-established theoretical considerations 
and support from descriptive research, there has been sur- 
prisingly little experimentation to validate experimentally 
the conditions which are thought to influence the develop- 
ment of academic motivation in young children. The be- 
havior modification literature is replete with studies in 
which children have been influenced to increase the amount 
of time they spend on academic tasks, or to be more persist- 
ent at academic activities. But to the knowledge of the 
present authors, the on-task behavior in all of these studies 
has been maintained through the use of externally imposed 
contingencies, e. g., (13). While incentives of various kinds
may play an important role in school learning, it is impor- 
tant to find ways to influence children to pursue learning 
activities in the absence of external control imposed by 
teachers or others. An important long-range goal of parents 
and educators alike is to establish conditions that will result
in the acquisition of relatively enduring preferences for
activities that may lead to further learning. 

Past research has established that there is a strong
relationship between home influences and school learning (4, 16,
10) and that this relationship is an impressively stable one
(8). Furthermore, research suggests that the degree to
which parents demonstrate that they value language and
school-related behavior is highly associated with school
achievement (9). Since home variables account for a major
portion of the variance in children's school achievement, it
seems reasonable to attempt to determine if differential
parental reinforcement of children's choices of activities
will lead to an increase in the frequency with which children
select reading materials as a preferred activity, and if this
influence will generalize to the classroom.

From the theoretical literature, descriptive research on
achievement motivation, and studies of the relationship of
academic accomplishment to home environments, the
picture which seems to emerge is one in which children who
develop motivation toward reading encounter reading 
activities within a wide range of established relationships 
with parents, siblings, grandparents, and other significant 
people. In general, they have dependent relationships with 
most of these people, resulting in a range of positive af- 
fective associations to cues in the setting in which these 
events take place. Consequently, the child who has had such 
a background has experienced great redundancy in ex- 
pressions of value toward reading. Expressions of "value 
reading" are in the form of models who read, and direct 
approval to the child for engaging in activities related to 
reading. This class of events may be thought of as an in- 
variant occurring within a great range of experiences in 
which other factors (nurturant people, situations, specific 
nature of the reading materials) are randomly varied. In 
addition to this heavy redundancy within a highly varied 
range of events, most of these experiences take place in 
situations of positive affect. It would come as no surprise, 
then, that a "motive for reading" acquired in this con- 
text would have a high probability for generalization to 
the school environment. 

This conceptualization of the development of motiva- 
tion toward reading guided the design of the intervention 
tested in the present research. 


Goal Selection 


Neither the goal of promoting continuity between home 
and school influences on the development of an academic 
choice or motivation in the children involved in this study, 
nor the specific objective of developing positive motivation 
toward reading activities, was selected arbitrarily. The present
research represents an extension of an earlier parent-training
program (11, 14) in which parents were successfully
trained in the use of specific socialization practices
designed to facilitate the development of question-asking 
skills in their first grade children. At the conclusion of 
that program, the Title I Parent Advisory Committee 
which had initiated the first study requested a continuation 
of the program, with new objectives focusing on some as- 
pect of children's reading behavior. 

The program which was designed in response to this 
request aimed at teaching parents who were trained during 
the previous year to generalize their skills to a new set of 
responses in their children. The specific objective was to 
train parents to influence their children to be more inter- 
ested in reading and to evidence this change in motive by 
choosing reading materials with increasing frequency dur- 
ing free-choice time. Changes in strength of motivation re- 
lating to reading activities were to be assessed by exam- 
ining children's choices of reading materials in comparison 
with other attractive alternatives in a standardized free- 
choice situation, and by testing the generalization of the 
preference to a classroom situation. 

It was hypothesized that children whose mothers were
trained in procedures designed to influence children’s
activity preferences would (a) show an increase in
selections of reading materials over attractive alternatives
in a standardized free-choice situation, and (b) display
generalization of this preference to the classroom, by choosing
reading activities with significantly greater frequency
than a control group of classmates.


Method 


Subjects 


The participants in this experiment were twenty native 
American Papago second grade children. All children were 
enrolled in two elementary schools on the Papago Indian 
Reservation in Arizona. These subjects were twenty of an 
original thirty whose mothers had been involved in a train- 
ing program during the previous year. While the original 
thirty children had been randomly selected, the twenty 
involved this year were those whose mothers were able to 
continue training for a second year. 

In order to keep training groups small enough to provide
individualized attention, the mothers of the twenty
children were randomly assigned to two treatment groups.
Training for one group was completed before training for
the second group began.


Procedure 


Mothers of all twenty Ss were trained to employ social
learning principles to influence their children's choice be-
havior by increasing the positive valence of reading materials.
Two Papago women were trained to serve as parent trainers.
Both women were bilingual and hence able to conduct
teaching sessions in the primary language of the participat-
ing mothers. Through a combination of the use of model-
ing procedures, role-play, and prepared written lesson
plans, the paraprofessionals were taught to model the 
desired behavior of a mother and child informally exam- 
ining and discussing reading material. Furthermore, they 
were trained to demonstrate how a parent might reinforce 
the child for attending to the reading materials so that 
his interest could be sustained; and finally the paraprofes- 
sionals learned to demonstrate the use of verbal praise that 
parents might employ with their child when they engaged 
in self-initiated reading activity during a free-choice situa-
tion. The paraprofessional women did not begin training
the parent groups until they could perform and demon-
strate mastery of these behaviors.

Immediately following training for the paraprofessional
change agents, parent training was initiated. Training was
completed for the first ten mothers before training was
initiated for the second group. With the first group of ten
mothers, usually six to eight attended group training
sessions, while the remainder were trained individually
in their homes. Because many mothers in Group II were
employed, only two or three were able to attend group
sessions regularly, and the remainder had to be trained
individually. All group training sessions were monitored
by a member of the project research staff, but the
paraprofessionals had primary responsibility for the
training.

Training was divided into five lessons. The first lesson
involved teaching the parents the appropriate parent-child
interaction sequence which involved a mother and child
informally examining and discussing a book together. The
paraprofessionals modeled the desired parent-child inter- 
action for a ten-minute period. Following the trainer 
modeling sequence, the mothers were divided into pairs 
and would then role-play the interaction sequence. Each 
mother was given the opportunity to role-play both the 
mother and child part, and role-play was continued until 
all mothers were able to perform the desired behaviors. 
Mothers were then instructed to conduct two training 
sessions with their child prior to the second training group 
meeting. At this point each parent-child session involved 
(a) performing the desired interaction sequence with 
their child for a ten-minute period, and (b) observing the 
children for one hour following the session and recording 
the amount of time the child engaged in self-initiated 
activity with the reading materials. 

Each subsequent training session increased the complex-
ity and number of behaviors that the mothers were re-
quired to master. However, only one new behavior was
added each session to avoid confusion and to insure mas-
tery prior to the presentation of an additional novel behavior
(see Table 1). Lesson 2 added the use of verbal praise for
attending behaviors during the mother-child interaction
sequence, and Lesson 3 added the utilization of reinforce-
ment for child-initiated reading activity during the one-hour
observation period that followed the ten minutes of mother-
child interaction.

Since the goal of the study was to increase children’s
preference for reading activities over competing attractive
alternatives in a free-choice situation, it was necessary to
include opportunities for decision- and choice-making dur-
ing the training. Consequently, in Lesson 4 parents were
trained to introduce their children to a novel and at- 
tractive additional stimulus and to allow the children to 
examine the stimulus prior to mother-child interaction 
with the reading materials. In Lesson 5 an additional novel 
stimulus was added which provided the child with a total 
of three alternative activity selections. In both cases, 
parents were taught to deliver verbal reinforcement to 
their children for the selection of reading materials over 
the other during the one-hour observation period. 

In this way children were presented with reading mater- 
ials in a highly valenced situation (i. e., interaction with 
the mother), but additionally they were provided with the 
opportunity to make choices and to be reinforced for the
selection of reading materials over more novel and vis-
ually attractive stimuli.


Data to evaluate the effects of the intervention were
taken under two conditions. The first condition was a
Table 1.—Content of Parent Training Session


Lesson 1

Behavior in Training
1. Parent-child role-playing sequence with reading
   material stimulus

Parent-Child Session
1. Ten minutes of interaction with reading materials
2. Following parent-child session, observation and
   recording of child-initiated reading activity during
   a one-hour period


Lesson 2

Behavior in Training
1. Parent-child role-playing sequence
2. Role-play use of verbal praise with child for
   attention to reading stimulus

Parent-Child Session
1. Ten minutes of interaction with reading materials
2. Use of verbal praise during session for attending
   behaviors
3. Observation and recording of child-initiated reading
   activity during a one-hour period


Lesson 3

Behavior in Training
1. Parent-child role-playing sequence
2. Role-play use of verbal praise with child for
   attending behaviors
3. Role-play use of praise during observation period
   for child-initiated reading activity

Parent-Child Session
1. Ten minutes of interaction with reading materials
2. Use of verbal praise during session for attending
   behaviors
3. Observation and recording of child-initiated reading
   activity during a one-hour period
4. Use of verbal praise for reading activity during
   observation period


Lesson 4

Behavior in Training
1. Parent-child role-playing sequence
2. Role-play use of verbal praise with child for
   attending behaviors
3. Introduction of a second choice stimulus (puzzles)
4. Role-play use of praise during observation period
   for child-initiated reading activity

Parent-Child Session
1. Ten minutes of interaction with reading materials
2. Use of verbal praise for attending behaviors
3. Introduction of child to puzzle stimulus
4. Observation and recording of child-initiated reading
   activity with puzzle stimulus present
5. Use of verbal praise for reading-choice activity
   during observation period


Lesson 5

Behavior in Training
1. Parent-child role-playing sequence
2. Role-play use of verbal praise with child for
   attending behaviors
3. Introduction of a third choice stimulus (blocks)
4. Role-play use of praise during observation period
   for child-initiated reading activity

Parent-Child Session
1. Ten minutes of interaction with reading materials
2. Use of verbal praise for attending behaviors
3. Introduction of child to block stimulus
4. Observation and recording of child-initiated reading
   activity with puzzle and block stimuli present
5. Use of verbal praise for reading-choice activity
   during observation period

situational task in which children’s choice behaviors were
measured under stimulus circumstances similar to those
which had been used in procedures employed by the child's
mother during home intervention. These data were in-
tended to show changes in the choice behaviors of children
whose parents were trained, and were collected before and
after parent intervention. The second condition was a free-
choice time in the child's classroom, which provided a
measure of generalization effects to a new situation. A con-
trol group was employed in this condition.
For each trial under the situational task condition, each 
child was taken individually to a small room by a Papago 
E. The room contained three tables, upon each of which 
was displayed an array of materials which included reading 
materials (trade books), blocks, and puzzles. Placement 
of the stimuli on the tables was randomized for each 
child. The children were invited to play with any of the 
materials they wished, and they were free to change their 
choice at any time. Observational data were collected for 
a ten-minute period during which the E sat unobtrusively 
in a corner of the room. At each ten-second interval the 
observer marked a check-sheet to note in which of the 
activities the child was engaged. Scores were expressed in 
ratio form, and converted to decimal fractions by dividing 
the number of initiations with reading materials by the 
total number of activity initiations. The stimuli available 
for choice were similar to the ones used during training 
with parents. 
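
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report does not say how an "initiation" was identified on the check-sheet, so the sketch assumes a new initiation is counted each time the recorded activity changes from one ten-second interval to the next. The function names and the sample record are hypothetical.

```python
from collections import Counter


def initiations(intervals):
    """Collapse interval codes into activity initiations.

    Assumption (not stated in the report): an initiation is counted
    whenever the recorded activity differs from the previous interval.
    Intervals coded None (no activity) never count as initiations.
    """
    starts = []
    previous = None
    for activity in intervals:
        if activity is not None and activity != previous:
            starts.append(activity)
        previous = activity
    return starts


def reading_score(intervals, target="reading"):
    """Reading initiations divided by total activity initiations."""
    starts = initiations(intervals)
    if not starts:
        return 0.0
    return Counter(starts)[target] / len(starts)


# Hypothetical ten-minute trial: sixty ten-second intervals.
record = ["reading"] * 20 + ["blocks"] * 15 + ["reading"] * 10 + ["puzzles"] * 15
score = reading_score(record)  # 2 reading initiations out of 4
```

Under this reading of the procedure, a child who returns to the books after trying another activity is credited with a second reading initiation, which is why the score is a proportion of choices rather than of time.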

For the generalization condition three kinds of activity 
centers were arranged in the rear of the classrooms in 
which the sample children were enrolled. Again, each 
child was individually invited to interact with the stimuli, 
and behavior was observed with the interval recording ob-
servation schedule. A control group was randomly selected
from each classroom and similarly invited to interact with
the materials.


Results 


Descriptive data on situational task performance are pre-
sented in Table 2.

Situational task data were analyzed using a 2 (groups)
X 2 (trials) repeated measures analysis of variance. Analysis
revealed a significant trials effect (df = 1, 18, F = 5.88,
p < .05) for the experiment. Post hoc analysis conducted
on the group effects revealed no significant group dif-
ferences across testing points. Table 3 presents a summary
of the analysis of variance.

The generalization test was evaluated with an inde-
pendent group t-test, and a significant difference favoring
the experimental group was revealed (df = 37, t = 5.2,
p < .01). Intercoder reliability for the observation instru-
ment was 99% as determined by examination of the num-
ber of observations in agreement divided by the total num-
ber of observations recorded.
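
The reliability index described here is simple percent agreement. A minimal sketch follows, assuming two observers coded the same sequence of intervals; the function name and the sample codes are hypothetical and are not data from the study.

```python
def percent_agreement(coder_a, coder_b):
    """Observations in agreement divided by total observations, as a percentage."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must record the same number of observations")
    if not coder_a:
        raise ValueError("at least one observation is required")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * agreements / len(coder_a)


# Hypothetical records: 100 intervals, with one disagreement.
coder_a = ["reading"] * 60 + ["blocks"] * 40
coder_b = list(coder_a)
coder_b[10] = "puzzles"
reliability = percent_agreement(coder_a, coder_b)
```

Percent agreement does not correct for agreement expected by chance, so it tends to run high when one activity dominates the record.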


Table 2.—Descriptive Statistics For Situational Task

                    Pre-test    Post-test

Experimental

Control


Between Groups
  Groups               1     .03      <1.00
  Error               18     .072

Within Groups
  Trials               1     .47      5.875*
  Trials X Groups      1     .016     <1.00
  Error

*p < .05
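
The surviving entries of the analysis of variance summary can be checked for internal consistency. The sketch below assumes, since the column headings are not preserved, that the three numeric columns are degrees of freedom, mean square, and F, and that each F is the effect mean square divided by the matching error mean square. The within-groups error term is not legible in the table; the value computed here is only what the reported Trials entries would imply, not a reported figure.

```python
def f_ratio(ms_effect, ms_error):
    """F statistic as the ratio of an effect mean square to its error mean square."""
    return ms_effect / ms_error


# Between-groups test: Groups mean square over between-groups error.
f_groups = f_ratio(0.03, 0.072)  # below 1, consistent with the "<1.00" entry

# Within-groups error implied by the reported Trials row (assumption, not reported).
implied_within_error = 0.47 / 5.875

# Interaction test using the implied error term.
f_interaction = f_ratio(0.016, implied_within_error)  # also below 1
```

Both non-significant entries come out below 1 under these assumptions, which agrees with the table as printed.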


Data from the recording sheets which parents maintained
to note the amount of time the children engaged in self-
initiated reading activity during the one-hour observation
period revealed the following: (1) 78% of the children en-
gaged in reading activity for thirty minutes or more during
the one-hour observation period, by the end of the training
period; (2) 71% engaged in self-initiated activity for forty-
five minutes or more; and (3) 21% of the children engaged
in sixty minutes of self-initiated reading activity.

Discussion 


The goal of the parent-training program reported here
was to increase the valence of reading materials and hence
to increase the frequency of reading material selection in 
the free-choice situation. Examination of the significant 
effects for trials would indicate that this was indeed ac- 
complished for children of parents in both training groups. 
By the end of training, nearly half of these children’s self- 
initiated selections were for reading materials. In addition, 
the amount of time the children engaged in reading activity 
at home during training was impressive. Furthermore, on the
classroom generalization task, experimental children 
selected reading materials more often than control children 
whose parents had not experienced training. 

A question that arises from inspection of the data is
why, at the final testing phase, the second group of children 
did not attain as high a level of performance as did child- 
ren whose parents were in the first group. While this dif- 
ference was not statistically significant, it is suggestive. 
A much higher percentage of mothers of Group I (60-80%) 
were regular in attending group training sessions. In Group 
II, however, only 20-30% were in regular attendance, the 
remainder being trained at home by the paraprofessionals. It 
may well be that group training sessions are more effective 
for a program such as this than home visitations. First, group
training allows for the possibility of peer modeling in-
fluence as the mothers have ample opportunity to observe
each other as well as the paraprofessionals. Therefore, they 
are exposed to multiple-model input and accompanying 
repetition. This would not be possible in the course of 
training individually conducted in the home. Second, 
group sessions allow for careful monitoring by a research 
staff member. While not directly carrying out instruction, 
the researcher was able to observe parental behavior, note 
when criterion level was reached, and provide corrective 
feedback when necessary. Monitoring is not possible during 
home training. 


Implications 


The results of this study suggest that the procedures 
employed provide an effective means for parents to facil- 
itate the development of a motive in their children which 
is important to both the home and school. The study identi- 
fied a possible starting point for achieving home-school 
continuity, but certainly motive-creation of any practical 
magnitude is a very complex task (3). Academic motivation 
is developed slowly through experiences at home and 
school, and where those environments are very disparate, 
the lack of continuity between home and school ex- 
periences may inhibit the development of motivation for 
academic activities. For this reason, the present research 
was followed by a feasibility study involving a larger range 
of child behaviors which could become the focus for
cooperation and coordinated effort between teachers and
Papago parents. Arrangements were made for each parent
and the teacher of their child to define cooperatively a
target behavior for the individual child and, with the help
of project staff, to devise and implement an intervention
plan which could utilize skills learned by the parent
through prior participation in the program.

Approximately half of the mothers were involved in 
this pilot effort, and additional insights have been gained 
which should serve to guide future attempts to influence
home-school continuity. In the “cooperative goal-setting”
it seems clear that choices were primarily predicated on
teacher goals and priorities, agreed to by parents. Parental
knowledge of the school situation was very limited, and it 
would be unrealistic to expect parents to act assertively 
in an unfamiliar and somewhat intimidating environment. 
Teachers were equally unfamiliar with the children’s out- 
of-school behavior and capabilities. It appears that a nec- 
essary step for any comprehensive movement toward in- 
creased continuity between home and school in cultural 
settings such as this is for parents and educators to gain 
greater familiarity with each other's goals. In all likelihood 
this must be accomplished in individual face-to-face inter- 
actions, because while representative bodies such as parent 
advisory boards do exist, Papagos are very reluctant to 
presume to represent the opinions of other parents. It is 
common to hear a representative say, “I don’t know how 
others may feel, but as for me. . . .”

Parents may be successfully trained in skills required to 
influence the intellectual competencies and specific motives 
of their children, but application of those skills to reach 
a significant number of jointly endorsed goals under con- 
ditions of severe discontinuity requires sustained effort 
and careful planning. 


Summary 


Discontinuity in goals and socialization practices of 
Anglo-dominated schools and the homes of children from 
ethnic minorities is a problem of national concern. A field 
experiment was conducted to test the effectiveness of pro- 
cedures designed to enable parents to influence children's 
motivation toward reading, a goal which provided a point 
of continuity in the values of parents and school personnel. 
Two groups of ten Papago native American mothers each 
were trained at different points in time by Papago parapro- 
fessionals to use reinforcement principles to increase the 
positive valence of reading as a free-choice activity for 
their second grade children. Effects of parent intervention 
were analyzed with a 2 (groups) X 2 (trials) repeated 
measures analysis of variance. The main effect for trials was 
significant (p < .05), and post hoc tests revealed no differ-
ences between training groups. A generalization trial utiliz-
ing a control group was designed to determine if training
effects generalized to a classroom free-choice situation.
An independent group t-test revealed a significant differ-
ence favoring the experimental group children over the
controls (p < .01). 

A follow-up feasibility study explored possible ways of 
using skills learned by parents in the training program to 
expand the base of home-school continuity. Implications 
of descriptive findings were discussed. 


FOOTNOTE 


1. The work reported herein was conducted under subcontract
to Indian Oasis School District #40, Arizona. The project was
supported by the Arizona State Department of Education as
Project No. 74-912C, under the authority of P. L. 89-10, Title I.
The opinions expressed in this report do not necessarily reflect
the position or policy of the Indian Oasis School District,
or the Arizona State Department of Education, and no official
endorsement by these agencies should be inferred.
Appreciation is expressed to the Papago parents and children
who participated in the research, and to the school administrators
and teachers who cooperated in the effort. We also wish to grate-
fully acknowledge the contributions of Irma Dean Edmond,
Elizabeth Siquieros, and May Galvez to
the field work. Thanks are also expressed to Jean Godier, who
provided secretarial services to the project and typed the final
manuscript.
Process Scale,” Journal of Social Psychology, 88:185-196, 1972.


ABSTRACT


Differences between and changes over a semester [...] on
self-perceptions of [...] low achievers. A self-rating scale,
devised with 37 student-suggested char[...] three different
ratings on stanine scales, was used to c[...] two groups on
[...] characteristics, and significant up[...] shifts and
interaction effects on all five group characteristics.
Interesting as [...] cues should be invaluable knowledge
for instructors.


IT IS ASSUMED THAT intellectual and scholastic
abilities are prerequisite to success in college. However,
some researchers (2, 6) report that measures of academic
ability alone are not sufficient factors in predicting the
level of academic performance, and other investigators
(4) report that there is no significant relationship between
college students’ abilities and their levels of academic
performances. However, differences in levels of academic 
performances (whether they are labeled as high and low 
achievers or over- and underachievers) have been reported 
to be significantly related to different personality character- 
istics and behaviors, such as self-perceptions (1, 2), self- 
confidence, attitudes, motivational drives (2, 5), past per-
formances (1), and strategies used in studying (4, 5).

Alexander (1) stated that the key to a student’s success 
or failure lies in his self-perceptions and how others (e.g.,
teachers) interpret his performances. When a student per- 
ceives himself as a failure, he develops an anxiety which 
impedes and reflects his performance. 

In a study of over- and underachievers, Lum (5) reported 
that overachievers tend to be more self-confident, have a 
greater motivation for studies, and greater capacity for 
working under pressure than underachievers. The under- 
achievers were described as having a greater tendency to 
procrastinate, to rely upon pressures in completing assign- 
ments, and to be more critical of educational methodol- 
ogy and philosophy than the overachievers. 

Goldman and Hudson (4) found no significant differ- 
ences between abilities and high-, middle-, and low-grade- 
point-average groups, but they did find significant differ- 
ences among these groups and the strategies the students 
used in studying. Specifically, these significantly different 
strategies were found to be planfulness (reflecting 
punctuality and fore-planning, e. g., class meetings, com- 
pleting notes and assigned readings) and formal reasoning 
(reflecting logical and mathematical reasoning). The in- 
vestigators supported the idea that these specific study 
strategies may be more fundamental determinants of 
college students’ levels of academic performances than the 
students’ abilities. 

Studies on non-intellectual factors and academic achieve- 
ment have shed some light on characteristics of different 
types of students, but additional information is needed in
order to indicate which non-intellectual and/or personality 
trait(s) influence different levels of academic performance. 


The present study attempted to examine the problem
by employing a different approach from those in the studies
reviewed. It investigated (a) whether differences exist be-
tween high and low achievers on their self-perceptions of
a quality student and (b) whether any changes occur dur-
ing a semester for these groups in their self-perceptions of
a quality student. (High and low achievers are referred to
here as students who have received grades of A and C,
respectively, in a course in which they were enrolled.)

It was hypothesized that there would be (a) significant
differences between high and low achievers in their self-
ratings of a quality student; (b) significant upward shifts
in the trends over trials during the semester; and (c) signif-
icant interaction effects between high and low achievers
on characteristics of a quality student over trials.


Method 


Subjects 


The Ss were 282 students (from different major fields 
of study) enrolled in introductory, educational, adolescent, 
applied, and industrial psychology, and clinical and learn- 
ing theory classes. Several instructors (male and female) 
taught the classes using their own styles and methods of 
teaching (e.g., lecture, discussion, group and individual 
projects and presentations). Ss participated on a volun-
tary basis. Ss were not aware if (or how) the data were
going to be analyzed, but were assured that their ratings
would not in any way influence their course grades. Those
Ss who rated themselves on all the characteristics of a
quality student during each of the three rating sessions
were included in the study. 

After the semester ended, course grades were added 
to the students’ self-rating scales. From this group, students 
with course grades of A and C (high and low achievers, 
respectively) were selected. The Ss were 162 high achievers 
and 120 low achievers. 


Instrument 


The instrument used to collect the data was a self-rating 
scale with 37 student-suggested characteristics of a quality 
student. The instrument was designed so as to enable 
students to record three different ratings on stanine scales 
for each of the characteristics. The scales were numbered 
1-9, with “1” indicating the lowest (least desirable) level, 
“5” the middle (neither desirable nor undesirable), and “9”
the highest (most desirable) level. 

In the instrument, the characteristics were grouped 
into two categories: In Class and Out of Class. Nine of the 
37 characteristics, those related to classroom instruction 
and learning, were placed in the In Class category. These 
characteristics were: attended classes; came prepared to 
classes; was alert and attentive; participated in class dis-
cussions; was open minded; was interested in the subject
matter; understood course objectives; took good notes; 
and asked when I did not understand material. 

The remaining characteristics, in the Out of Class cat- 
egory, were grouped under four sections: Study Habits 
and Attitudes; Student-Student Relationships; Student- 
Instructor Relationships; and Physical and Emotional 
Needs. Fifteen characteristics fell under the Study Habits 
and Attitudes section; they were: had a good study 
schedule; had a special place for study; had a positive at-
titude toward learning; was determined and studied hard;
read textbooks and references assigned; was well organ-
ized; used library effectively; did extra work for personal
satisfaction; avoided hasty decisions; used dictionary
effectively; admitted when I learned materials; evaluated
myself often; did my best in all assignments; inter-
related course contents; and set goals and objectives.
There were three characteristics in the Student-Student
Relationships section, namely, discussed topics and ideas;
made friends in classes attended; and formed my own
opinions. The section Student-Instructor Relationships 
consisted of the following four characteristics: got to 
know the instructor; asked instructor for help; respected 
instructors; and cooperated with instructors. The remain- 
ing six characteristics fell in the Physical and Emotional 
Needs section. Specifically, these characteristics were: 
had a well-balanced diet; had sufficient amount of rest;
learned to relax; participated in physical activities; did not 
overload myself with work; and developed other interests 
and hobbies. 


Procedure 


Students were informed as to the meaning of the 1-9
numbers on the scales and given instructions as to how to
record their responses. On three different occasions during
the semester (beginning, mid-term, and end), students
rated themselves on the 37 characteristics.

The administration of the self-rating scales was done
during class periods, allowing as much time as the students
needed to complete it. The three self-rating sessions were
similar with one exception. During the mid-term and final
rating sessions a separate self-rating form was used. The use
of the separate form was to provide students an opportu-
nity to record their perceptions at that time without being
influenced by their previous ratings. Then, the ratings
from the special form were transferred to the initial rating
scale.

For purposes of analysis, the ratings for each of the
previously mentioned characteristics in the In Class and
the sections in the Out of Class categories were collapsed
(Table 1). A trend analysis (3) was used to analyze the data
in which the high and low achievers (groups) were treated
as the main effects and the three ratings on each group of
characteristics as the trial effects. Post-t-tests were com-
puted for the trial and interaction (main X trial) effects
for the groups of characteristics yielding significant F-
ratios. The analysis of the post-t-tests for the interactions
consisted of first adjusting the cell means by column and
row effects.


Results and Discussion


The means and standard deviations for high and low
achievers for each of the grouped characteristics of a
quality student on the three trials are presented in Table 1.
In Table 2, the F-ratios and their levels of significance are
reported for high and low achievers under the column
Group; trend effects under the column Trial; and inter-
action effects under the column Interaction. All post-t-
tests reported as significantly different in the discussion
of the report were significant at .05 level or lower.


Table 1.—Means and Standard Deviations for Grouped
Characteristics of a Quality Student on Three Trials for
162 High and 120 Low Achievers


CHARACTERISTICS OF ACHIEVERS

In Class                              High
                                      Low
Out of Class
  Study Habits and Attitudes          High
                                      Low
  Student-Student Relationships       High
                                      Low
  Student-Instructor Relationships    High
                                      Low
  Physical and Emotional Needs        High
                                      Low

7.224    .986
6.559   1.461
7.648   1.104
6.881   1.709
7.749   1.030
7.233   1.370


Table 2.—F-Ratios for Groups (High and Low Achievers), Trials, and Interactions (Groups and
Trials) on Ratings of the Grouped Characteristics of a Quality Student


GROUPED CHARACTERISTICS

In Class
Out of Class
  Study Habits and Attitudes
  Student-Student Relationships
  Student-Instructor Relationships
  Physical and Emotional Needs

F-RATIO

Group (df 1/280): 29.97*, 14.64*, 9.49*
Trial (df 2/560): 63.66*, 101.34*, 62.27*
Interaction (df 2/560): 3.49**, 5.25*, 5.57*

** p < .05

* p < .01


Differences between High and Low Achievers

Three of the five groups of characteristics were signif-
icantly different between high and low achievers. To this
degree, the present hypothesis was confirmed. The three
groups of characteristics, revealing significant differences
between the groups, were the In Class category and two
sections of the Out of Class category (Study Habits and
Attitudes, and Student-Student Relationships). In each
case, the high achievers rated themselves significantly
higher than the low achievers.


Differences over Trials

The results of the analysis (Table 2) show that all of the
groups of characteristics yielded significant differences
(p < .01) over the trials. This was interpreted as both
groups having indicated that they had changed over the
semester on their self-perceptions of a quality student. It
is obviously difficult to determine which factor(s) influenced
the changes, whether they were due to students' better
conceptions of what characterizes a quality student, or
due to changes that occurred because of the influence of
various factors, such as familiarity of the environment,
better acquaintance with peers and instructors, and/or
better knowledge of the subject matter. The fact that
changes reflecting upward trends occurred confirms the
hypothesis made.


Differences in the Interaction Effects

All five groups of characteristics produced significant
F-ratios in the interaction effects (Table 2), thus confirm-
ing the hypothesis made. In order to perceive more clearly
the patterns which emerged between the high and low
achievers on the five groups of characteristics, adjusted
mean scores for the three trials on each of the five groups
of characteristics are graphically displayed in Figure 1.
From these data (Figure 1) one perceives that, generally,
high achievers rated themselves low on the initial rating
on these characteristics and then higher on the second and
third ratings, respectively. For the low achievers, the self-
rating pattern was just the reverse. Low achievers rated
themselves high on the first rating and significantly lower
on the second and third ratings, respectively.


Figure 1.—Adjusted Means on Three Trials for Characteristics
of a Quality Student (solid line: high achievers; broken
line: low achievers)


Examination of the specific traits in each of the grouped 
characteristics seems to indicate that attitudes toward learn- 
ing may be the overall factor influencing a student's level 
of academic performance. Attitudes are closely related to 
such characteristics as attending classes, coming prepared 
for classes, alertness and attentiveness, etc. (listed in the 
In Class category). Attitudes toward 
influenced by one’s com 
self-confidence, best eff 


student-instructor 


realities should reduce, 
adequacy and tension and aj 


Conclusion 


The findings in the present study are in agreement with 
those of previous reports (1, 2) that students’ self-per- 
ceptions are related to their ley 
ance. In addition, the present study shows, more specif- 
ically, how self-perceptions of a quality s 
and low achievers differ and change over 
Knowledge of the traits descriptive of hi 
achievers should help instructors in better understanding
their students. As a result, instructors might take a positive 
approach by teaching students how to study, help students 
develop positive attitudes towards learning, assist students 
in setting specific objectives and long-range goals, and 
assist students to make frequent self-evaluations. Being 
cognizant of the fact that some students will need more 
help than others, an instructor may then provide ex- 
periences in which students can succe 


these students in developing a better perception of them- 


selves. These approaches may be a beginning in a direction 
leading to higher academic performances.
ANALYSIS OF THE UNIT TESTING COMPONENT
OF THE PERSONALIZED SYSTEM OF INSTRUCTION

ABSTRACT


The present study compared examination scores of 173 undergraduate students in a course taught by the Personalized System of 
Instruction under three conditions: required unit testing, optional unit testing, and no unit testing. Each of the six course sections 
was randomly assigned a different testing sequence in order to determine whether the success of PSI had been due to unit testing 
procedures or to the study questions. The dependent variable was a 65-item multiple choice examination administered at the end 
of each three-week phase of unit testing. The results of a Lindquist Type I ANOVA indicated a significant sequence by conditions 
interaction (p <.05). Tests of simple main effects showed that when required unit testing came first in the test sequence, the scores 
on the subsequent exam were significantly higher than the scores on the same exam following the other two unit test conditions. A 
Lindquist Type III ANOVA revealed a significant interaction between GPA and unit test conditions. Tests of simple main effects 
indicated no significant difference among the three unit test conditions on exams of students with GPAs in the upper 25 % of the 
class. Students with GPAs in the lowest 25%, however, attained their highest scores on the exam which followed the required unit 
testing, and scored higher on the exam following no unit testing than on the exam following optional testing. 


AS THE KELLER PLAN (7), referred to as the Personalized
System of Instruction (PSI), gains increasing attention,
its merits are being evaluated. The PSI method
is based on identifying the key elements of a course
with study questions from which unit tests and course
examinations are composed. Each student is required to
achieve a predetermined criterion on a series of unit tests
before proceeding to the course examinations. Immediately
after each unit test, the proctor is available to provide
feedback to the student. Evaluative research on PSI has
revealed definite characteristics of this system as compared
to the lecture method.

Several researchers (2, 9, 13) have reported that
students score higher on final examinations in PSI format
courses than students in the lecture format courses. It
has also been demonstrated that students score higher
on essay tests (2, 13) and express more favorable attitudes
towards the course with PSI than with the lecture
method (9, 11, 13, 14). Furthermore, with personalized
instruction there is usually a skewed distribution with
many A's and few low grades (1, 3, 4, 7). Recent research
indicates that two major components of PSI which increase
students' performance are the study questions and
unit tests prior to examinations (5, 6, 10, 12).

The effects of unit testing, however, have not been 
isolated in any of the preceding studies. In order to 
determine whether the success of PSI has been due to the 
unit testing procedure or to the study questions, the 
present study compared student achievement on course 
examinations after required unit testing, optional unit 
testing, and absence of testing. 


Method 


Subjects, Materials, and Course Format 


One hundred and seventy-three undergraduates enrolled 
in six sections of adolescent psychology served as Ss during 
the Fall 1973 and Winter 1974 quarters. All sections were 
taught by graduate teaching assistants. All students were 
given a course syllabus which organized the reading as- 
signments, audio-visuals, guest speakers, and study questions 
into nine units. Approximately one week was devoted to 
each unit. 


Each student was then instructed to write out his own 
grade contract, which could be altered at any time through- 
out the quarter. The required components of all con- 
tracts were four course examinations and three unit tests. 
Additional credit could be earned through the following 
activities: book reviews, field projects, class presentations, 
class attendance, verbal participation in class discussions, 
and student aid programs. At the end of the quarter, all 
points were converted to letter grades according to the following
scheme: A = 285 points with a minimum of 27
on the three combined unit test scores and a minimum of
210 on the four course examinations; B = 260 points with
a minimum of 24 on the three unit test scores and a minimum
of 180 on the four examinations; C = 210 points
with a minimum of 18 on the three unit tests and 140 on
the four examinations. Credit from the optional activities
could be added to these test minimums to attain a desired
grade.
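
The grading scheme above can be sketched as a small lookup. This is only an illustration: the function and variable names are hypothetical, each point value is read as a lower bound ("at least"), and the total is assumed to already include any credit from optional activities.

```python
from typing import Optional

# Minimums transcribed from the grading scheme in the text:
# (letter, total points, combined unit-test points, combined exam points).
# Reading each value as an "at least" threshold is an assumption.
CONTRACT_MINIMUMS = [
    ("A", 285, 27, 210),
    ("B", 260, 24, 180),
    ("C", 210, 18, 140),
]


def contract_grade(total_points: int, unit_test_points: int,
                   exam_points: int) -> Optional[str]:
    """Return the highest letter whose three minimums are all met, else None."""
    for letter, min_total, min_units, min_exams in CONTRACT_MINIMUMS:
        if (total_points >= min_total
                and unit_test_points >= min_units
                and exam_points >= min_exams):
            return letter
    return None  # the text does not describe grades below C


# A student with enough total points for an A but too few unit-test
# points falls back to the next grade whose minimums are all met.
print(contract_grade(290, 27, 215))
print(contract_grade(290, 25, 215))
```

The second call shows why the test minimums matter: optional credit can raise the total, but it cannot substitute for the required unit test and examination points.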
Table 1.-Lindquist Type I Analysis of Variance

SOURCE              df         SS          MS         F

Between subjects           24849.688     144.475
  Sequence (A)        5     2392.125     478.425     3.5577*
  Error (B)         167    22457.563     134.476

Within subjects     346     6594.937      19.061
  Tests (B)           2      722.375     361.188    22.55*
  A X B              10      522.832      52.283     3.26*
  Error (W)         334     5349.730      16.017

Total               518    31444.625      60.704

*p < .01
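
As a worked check on Table 1, each F-ratio can be recomputed as the effect mean square divided by the matching error mean square. The sketch below assumes the usual mixed-design convention, which the table's values are consistent with: the between-subjects factor (Sequence) is tested against Error (B), and the within-subjects terms (Tests and the interaction) against Error (W). Variable names are illustrative only.

```python
# Mean squares (MS) transcribed from Table 1.
MS_SEQUENCE = 478.425       # Sequence (A), between subjects
MS_ERROR_BETWEEN = 134.476  # Error (B)
MS_TESTS = 361.188          # Tests (B), within subjects
MS_INTERACTION = 52.283     # A X B
MS_ERROR_WITHIN = 16.017    # Error (W)


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F is the effect mean square divided by its error mean square."""
    return ms_effect / ms_error


f_sequence = f_ratio(MS_SEQUENCE, MS_ERROR_BETWEEN)
f_tests = f_ratio(MS_TESTS, MS_ERROR_WITHIN)
f_interaction = f_ratio(MS_INTERACTION, MS_ERROR_WITHIN)

# Compare with the F column of Table 1 (3.5577, 22.55, 3.26).
print(round(f_sequence, 4), round(f_tests, 2), round(f_interaction, 2))
```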


Table 2.-Mean Examination Scores after Each Testing Condition

Columns: Unit Testing Sequence*; N; After Required Testing (Mean, SD);
After Optional Testing (Mean, SD); After No Unit Testing (Mean, SD)

*R = Required Unit Testing
O = Optional Unit Testing
N = No Unit Testing


Unit Tests and Course Examinations


The course was divided into three phases of three
weeks each. During one phase the unit tests were required,
during another the unit tests were not available, and during
the third phase the quizzes were available but the
scores were not recorded. The three phases were administered
to each of the six sections in a different, randomly
assigned sequence.

During the three-week required testing phase, students
were given a list of times and places when proctors would
be available. The students were then instructed that each of
the three quizzes was composed of ten short answer
questions randomly selected from a pool of thirty questions
in the syllabus for each unit. Students
could take [illegible] forms of each quiz up to three times
to meet the minimum criterion for their contracted grade.
Quizzes were administered by graduate student proctors in
a designated classroom. Immediate feedback was available,
although students were not required to stay for the feedback.
The three required quizzes had to be taken before
the examination over those three units was given in class.
Class time was not utilized to answer or lecture on the
study questions.

Students were also informed that the questions on the
four major course examinations would come primarily
from study questions in the syllabus. Each of the three
examinations was composed of 65 multiple choice items,
covering three units of materials. The fourth multiple
choice exam was a comprehensive test covering most of the
course content.
Experimental Design


A Lindquist Type I ANOVA was utilized (8) with the
unit testing condition as the repeated measures factor
and the six sequences of testing as the between-subjects
factor. There were three levels of testing and six levels of
sequence group. The data were further analyzed by use of
a Lindquist Type III ANOVA with cumulative grade point
average as a blocking variable. The high and low 25% of
each group were the two levels of GPA.
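
The six levels of the sequence factor follow from ordering the three testing conditions in every possible way (3! = 6). A minimal sketch of that enumeration, and of one way to hand the sequences out at random, is given below. The section labels and the seeded shuffle are hypothetical; the text says only that each section was randomly assigned a different sequence.

```python
import itertools
import random

CONDITIONS = "RON"  # R = required, O = optional, N = no unit testing


def testing_sequences():
    """All orderings of the three testing conditions."""
    return ["".join(p) for p in itertools.permutations(CONDITIONS)]


def assign_sections(section_names, seed=None):
    """Give each section a different sequence, in random order.

    The assignment mechanism is an assumption made for illustration.
    """
    sequences = testing_sequences()
    if len(section_names) != len(sequences):
        raise ValueError("need exactly one section per sequence")
    rng = random.Random(seed)
    rng.shuffle(sequences)
    return dict(zip(section_names, sequences))


sequences = testing_sequences()
print(sequences)  # ['RON', 'RNO', 'ORN', 'ONR', 'NRO', 'NOR']

# Hypothetical section labels; the source does not name the sections.
assignment = assign_sections(["S1", "S2", "S3", "S4", "S5", "S6"], seed=0)
print(assignment)
```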


Results 


The results of the Lindquist Type I ANOVA indicated
a significant sequence by test condition interaction as
shown in Table 1. The three conditions of unit testing
were differentially effective depending on the order of the
required testing sequence (Figure 1). Due to the presence
of a significant interaction, tests of simple main effects
were applied. Means, number, and standard deviations for
each group are shown in Table 2. Tests of simple main
effects and differences between all pairs of means revealed
differential effects on student examination scores
for the following sequences: RON, ORN, and RNO
(R = Required testing, N = No testing, O = Optional testing).
For the RON and ORN sequences, the scores on
the exam which followed the required unit testing were
significantly higher than the scores following the other two
conditions (p < .05). For the RNO sequence, the exam
scores after required testing were significantly higher
than the scores after the optional testing (p < .05). No
differences in subsequent examination scores followed
the optional and no unit testing conditions. This can be
attributed to students not taking tests under the optional
testing condition. (Only 2 of the 173 students chose to
take unit tests.)

[Figure 1.-Sequence by Tests Interaction*. Vertical axis: Test
Scores; sequence labels recovered from the plot: ONR, RON, ORN.]

The results of a Lindquist Type III ANOVA indicated
a significant interaction between GPA and testing condition
as shown in Table 3. Testing conditions were differentially
effective for students with high and low GPAs
(Figure 2). Means, number, and standard deviations for
each group are shown in Table 4. Tests for simple main
effects for each level of GPA indicated no significant
differences among the three unit testing conditions for
students with a high GPA. However, a Newman-Keuls test
indicated that students within the low GPA group
scored significantly higher on the exam taken after required
unit testing than on the exams following the other two
conditions (p < .05). In addition, the exam scores following
no unit testing were significantly higher than the
scores following optional unit testing (p < .05). Since so
few students chose to take unit tests under the optional
testing condition, it had been expected that the results
of the optional and no unit test conditions would be
essentially the same. Perhaps the students with lower GPAs

[Figure 2.-GPA by Tests Interaction. Horizontal axis: Required
Unit Tests, Optional Unit Tests, No Unit Tests; plotted lines:
High GPA, Low GPA.]


*R = Required Unit Testing
O = Optional Unit Testing
N = No Unit Testing
Table 3.-Lindquist Type III Analysis of Variance

SOURCE              df         SS          MS         F

Between subjects     71    12476.398     177.132
  GPA (B)             1     5995.563    5995.563    71.74*
  Sequence (C)        5     1340.004     268.001     3.2067*
  B X C               5      226.246      45.249      .54
  Error (B)          60     5014.586      83.576

Within subjects     144     2964.102      20.584
  Tests (A)           2      531.563     265.781    17.00*
  A X B               2      160.539      80.270     5.135**
  A X B X C          10      144.430      11.443     0.73
  Error (W)         120     1875.821      15.632

Total               215    15540.500      72.281


Columns: After Required Unit Testing (Mean, SD); After Optional
Unit Testing (Mean, SD); After No Unit Testing (Mean, SD)

R = Required Unit Testing
O = Optional Unit Testing
N = No Unit Testing
felt more pressure to work under the no testing condition,
which resulted in higher exam scores.

These results are in agreement with those of previous
[illegible] compared exam scores of students who
changed from lecture to personalized instruction. In the
[illegible] et al., students whose scores were in the
top half of the class on the first exam were not affected
by the change in teaching procedures. The change to
personalized instruction, however, produced an increase in
scores of students in the lower half of the class.


Discussion 


There are several implications of the present study for
teachers. The most effective method for increasing
students' examination scores under a PSI format
is to administer a required quiz over the study questions
early in the course. Furthermore, students with low GPAs
benefit more from required testing than students with
high GPAs. Given the choice of whether or not to be given
a quiz over the study questions, the majority
of students chose not to be tested, and their subsequent
course examination scores did not increase above scores
obtained during the required testing phase. The decision
not to take unit tests did not markedly decrease the scores
of students with high GPAs but did significantly decrease
the scores of students with low GPAs. Providing students
with study questions did not significantly improve examination
scores unless a required unit quiz was administered
over the questions prior to the course examination. Teachers,
therefore, might explain this phenomenon to their classes in
order to prevent students with lower GPAs from making
the decision not to be tested over study questions.

Replication of the present study with graduate
or public school populations would further clarify the
efficacy of unit tests and study questions. Comparisons
might also be made between classes who remain on a required
testing schedule throughout a course and classes
who are tested only on a final examination.




Type Objectives on Subject-Matter Learning, 


. Jenkins, J.R.; and Neisworth, J.T., "Th 
ABSTRACT


[illegible]ing a grade of A on unit exams (representing 75% of the
course grade) in a college [illegible] unit exams. Group 2 students
were [illegible]. Group 3 students were required to accumulate 90%
of the cumulative total points [illegible]tly superior to those of
the other groups in: (a) their [illegible] course grade. Also, there
were [illegible]


THE APPLICATION OF operant conditioning techniques
to classroom instruction has been a topic of much
recent research [see (8)], particularly in the area of
individualized instruction (4).

In most personalized instruction methods, the course
material is divided into discrete, but interdependent, units.
Students proceed through the units sequentially and are
required to demonstrate mastery of the units by taking
exams on the unit material. With an emphasis on what
pupils know and not on what they don't know, students
may retake exams on units in which exam performance
indicates insufficient mastery of the material. Most often
students are required to pass units with a perfect score,
the final grade being determined by the number of units
passed. However, sometimes students are not required to
pass individual units, their grade being determined instead
by the total number of points accumulated on all the
unit exams considered collectively. Further, in many
instances a comprehensive exam grade and/or a laboratory
[illegible] is averaged in with the grade from the unit
exams to determine the final course grade.

Studies have compared personalized courses to traditional
[illegible] 6) and (1), and have also examined
[illegible] which may be contributory to the success of
personalized methods. For example, Farmer et al. (2) found
that students of proctors performed better than those of
non-proctors in personalized courses. Semb et al. (7) found
that students did better on unit exams when test
questions were similar to study question items. In a study
by Johnston and O'Neill (3) unit exam performances
were found to vary directly with the criteria
necessary to receive a grade of A. Semb (6) investigated
the same variables as Johnston and O'Neill using a different
design and response measure. He concluded that a high
mastery criterion produced better test performance than
a lower criterion.

Studies investigating variables influencing test performance
have typically employed within-subject designs
and have dealt with undergraduate populations. The
present study investigated the role that differing mastery
criteria would have on test performance of different
groups of students enrolled in the same course. In addition,
a graduate course that met only once a week was selected
for study in order to extend the empirical findings of
personalized courses to include a more "advanced" population
meeting on once-a-week schedule.


Method


Subjects

Method 


Subjects 


The Ss were 102 students assigned randomly to three
sections of a course in educational psychology at
[illegible], respectively. There were
two dropouts [illegible] Sections 1 and 2, and three from
Section 3. The senior author was the instructor in all
sections.


Procedure 


[illegible] material was divided into ten
[illegible] study questions covering the
[illegible] by the instructor. Additionally, a
[illegible] or more test items was composed for each
[illegible] never duplicated study questions but
[illegible] same content area. All study materials were
the same for each section. Each section met once a week
[illegible]. In each class during the initial 45 minutes,
[illegible] took a written exam on the week's reading
(i. e., one unit). The rest of the time was devoted
to general discussion of reading material and pertinent
[illegible]. There were no formal lectures.

Units were covered one week at a time and students
[illegible] not take a unit exam before it was scheduled in
class; however, they were permitted to take [illegible] other exams
[illegible] often as desired by making [illegible] administered the
retakes in his office. No retakes could be taken
after [illegible] the next exam was scheduled (i. e., seven
days [illegible]), and only one retake could be taken per
day. [illegible] and retake items were randomly selected from
the [illegible] test items for each unit, with the provision
that [illegible] retake item could have appeared on the original
[illegible] any previous retake that any particular individual
had taken. Students were given grades immediately
after [illegible]ing an exam or retakes. However, students could
[illegible] an exam or retake until after the next unit exam
had been scheduled. This latter contingency, the large
test item pool, and the seven-day limited hold discouraged
students from passing potential retake questions on to
someone else.

Each unit exam and retake was worth 20 points and
was composed of five multiple choice items and five 
short answer items each worth 2 points. Multiple 
choice items were graded either 2 or 0, while short answer 
items were graded 2, 1, or 0. Only the best score for each 
unit contributed to the final grade. 
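
The scoring rule just described can be sketched as follows. The function names and the sample answers are hypothetical; only the point values (2 or 0 for multiple choice, 2, 1, or 0 for short answer, best attempt counts) come from the text.

```python
def score_attempt(multiple_choice_correct, short_answer_points):
    """Score one 20-point unit exam or retake.

    multiple_choice_correct: five booleans, each worth 2 points or 0.
    short_answer_points: five integers, each graded 2, 1, or 0.
    """
    if len(multiple_choice_correct) != 5 or len(short_answer_points) != 5:
        raise ValueError("each attempt has five items of each type")
    if any(points not in (0, 1, 2) for points in short_answer_points):
        raise ValueError("short answer items are graded 2, 1, or 0")
    mc_total = sum(2 for correct in multiple_choice_correct if correct)
    return mc_total + sum(short_answer_points)


def best_unit_score(attempt_scores):
    """Only the best score for a unit counts toward the final grade."""
    return max(attempt_scores)


# Hypothetical original exam and one retake for the same unit.
first = score_attempt([True, True, False, True, True], [2, 1, 2, 2, 0])
retake = score_attempt([True] * 5, [2, 2, 1, 2, 2])
print(first, retake, best_unit_score([first, retake]))  # 15 19 19
```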

One week after the tenth unit exam was scheduled, all 
students took a comprehensive final exam. The final exam 
was worth 50 points, and no retakes were permitted. It 
was composed of 25 multiple choice and short answer 
questions graded just like items on the unit exams. In- 
cluded with the final, each student received the standard 
SUNY-Geneseo course evaluation form, in which students
could rate the overall quality of the course on 5-point
rating scales (1 = unsatisfactory, 5 = excellent).

In the grading of all tests in the course, neither the
names of students nor the sections that tests came from
were known to the instructor (who did all the grading)
until after all exams were graded. Thus, all grading was
blind to insure against score biasing.

Table 1.-Group Mean Point Total on Unit Exams, Unit Exam Grades,
Final Exam Score, Number of Retakes, and Course Grade

Columns: Group; Mean Total Points on Unit Exams; Mean Grade Point
Average for Unit Exams; Mean Final Exam Score; Mean Total Retakes of
Unit Exams; Mean Course Grade Point Average

*Differed significantly from Groups 1 and 3 (p <.00


The course grade represented a 25% weighting of the
final exam and a 75% weighting of the unit exams. On the
final exam, letter grade equivalents of rounded point totals
were ascertained as follows: A = 45 - 50 (90%); B = 44 - 40
(80%); C = 38 - 40 (75%); D = 35 - 37 (70%). As follows,
groups differed according to how letter grade equivalents of
unit exams were assigned:

Group 1: In order for the 34 students of Group 1 to
receive a grade of A, they had to get 90% of the total possible
points on a unit exam on all ten unit exams (i. e.,
18 of 20 maximum points). Grades of B, C, and D required
getting at least 18 points on 9, 8, and 7 of the ten
unit exams, respectively.
Group 2: The 32 students of Group 2 received a grade 
of A if they correctly answered all questions (i. e., 100%) 
on nine of the ten unit exams. Grades of B, C, and D re- 
quired perfect scores on 8, 7, and 6 unit exams, respectively. 
Group 3: The 36 students of Group 3 were required 
to accumulate 90% of the cumulative total points of all 
ten unit exams considered collectively (i. e., 180 out 
of 200 points) in order to get an A. Grades of B, C, and 
D necessitated the accumulation of 160, 150, and 140 
points, respectively (i. e., 80%, 75%, and 70% of the 
200 total points on all unit exams). 
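
The three criteria can be compared side by side with a short sketch. The function name is hypothetical, and returning "E" for a record below the D cutoff is an assumption based on the A-to-E grade point scale mentioned in the Results.

```python
def unit_exam_grade(group, best_scores):
    """Letter grade for the unit exam portion under each group's rule.

    best_scores: the best score (0-20) on each of the ten unit exams.
    """
    if len(best_scores) != 10:
        raise ValueError("expected ten unit exam scores")
    if group == 1:
        # Count of exams with at least 18 of 20 points (90%).
        value = sum(1 for s in best_scores if s >= 18)
        cutoffs = {"A": 10, "B": 9, "C": 8, "D": 7}
    elif group == 2:
        # Count of exams with a perfect score.
        value = sum(1 for s in best_scores if s == 20)
        cutoffs = {"A": 9, "B": 8, "C": 7, "D": 6}
    elif group == 3:
        # Cumulative points over all ten exams (200 possible).
        value = sum(best_scores)
        cutoffs = {"A": 180, "B": 160, "C": 150, "D": 140}
    else:
        raise ValueError("group must be 1, 2, or 3")
    for letter, cutoff in cutoffs.items():
        if value >= cutoff:
            return letter
    return "E"  # assumed label for anything below D


# Two hypothetical records with nearly the same point total.
steady = [18] * 10           # 180 points, never perfect
uneven = [20] * 9 + [4]      # 184 points, one very poor exam
for record in (steady, uneven):
    print([unit_exam_grade(g, record) for g in (1, 2, 3)])
```

The two records show how the rules diverge: a student who steadily scores 18 earns an A under the Group 1 and Group 3 rules but fails the perfect-score rule of Group 2, while a student with nine perfect exams and one very low score earns an A under the Group 2 and Group 3 rules but only a B under Group 1.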


Results 


The groups were compared with respect to the following
six variables: (a) total points on the weekly exams;
(b) grades on unit exam portion; (c) final exam score;
(d) number of retakes taken during the course; (e) final
course grade; and (f) course rating. Table 1 contains the
mean group values for each of the above variables.
Analysis of the data of Table 1 was as follows:


[illegible] that portion of the course and assigning it a
[illegible] equivalent (GPA) for which A = 4 [illegible]
and E = 0. An analysis of variance was then performed
[illegible] the other groups, which did not differ significantly
from each other.

Final exam score: The groups differed significantly in
their performance on the final exam (F = 42.47 [illegible]
p < .01). Post hoc t-tests revealed that Group 2 [illegible]


Course grade: Differences in overall course performance
were evaluated numerically in a manner identical
to that used in evaluating unit exam grades. The groups
differed significantly from each other (F = 25.42, df =
2, 99; p < .01). Post hoc t-tests indicated that Group 2 had
significantly higher grades than either of the other groups
(p < .01 for each comparison), which did not differ from
each other.

Table 2 contains a summary of the data analyses
performed on the students' course performance.

Table 2.-Summary of Analyses Performed on the Data of Table 1

SOURCE               df      MS        F

Total unit points     2    138.56      1.69
Error                99     81.80

Unit GPA              2    2.0049     61.50*
Error                99     .1326

Final exam            2    151.71     42.47*
Error                99      3.57

Retakes               2    939.70    139.40*
Error                99      6.74

Course GPA 2.25 25.42*
Error

*p < .01


Course evaluations: The mean overall ratings given to 
the course were 4.4 (SD = .38); 4.4 (SD = .29); and 4.5 
(SD = .19) for Groups 1, 2, and 3, respectively. The group 
ratings did not differ significantly from each other. 

For all students, a correlation was calculated between 
the total number of points on unit exams and the final 
exam score. The obtained correlation of .84 was signif- 
icantly different from zero (p < .01), indicating that 
students who did well on unit exams also tended to do 
well on the final exam. 


Discussion 


The data indicate that the students in Group 2 performed
significantly better than students in the other
groups as measured by mean unit exam GPAs, final exam
score, and overall course GPAs, and in so doing support
Johnston and O'Neill (3) and Semb (6) using different
populations and time schedules. Thus, students who were
expected to perform at higher levels were generally able
to do so. Further study is required to determine if the
differences were due to greater studying or some other
factor.

Interestingly, students in Group 2 did not accumulate
significantly more points during the course. Examination
of the data indicates that this resulted because students
who had already passed the required number of unit
exams necessary for a grade of A did very poorly, that
is, only got a few points, on one of the ten unit exams.
(Most did poorly on exam 10, although some did poorly on
one of the other unit exams and passed exam 10.) Students
who received A in the other groups consistently tended
to miss one or two points on each unit exam and never did
disastrously on any one exam. The superior performance
of Group 2 students was sufficient to produce significantly
better final course grades.

This study supported the findings of Semb (6) that
higher criteria produce not only better examination
performances but more retakes as well, indicating that,
in general, it may be difficult to avoid failures on unit
exams. In this course there was a small but steady de- 
crease in retakes as the course progressed. Perhaps greater 
care should be given to shaping test performance so as to 
produce learning without errors and, in so doing, pos- 
sibly avoiding any bad side effects associated with failure. 
Further, retakes may entail extra time spent by instructors 
in administration of courses. In the present study, retakes 
took about 45 minutes on the average to administer and 
about 15 minutes to compose. Thus, Group 2 occupied 
more of the instructor’s time than the other groups. 

In the absence of a technology that can prevent unit 

exam failures, instructors may wish to weigh the advantages 
of the method used for Group 2 against the extra time re- 
quired for its administration. 


FOOTNOTE 


1. Requests for reprints may be sent to Edwin Carter, Department
of Psychology, SUNY-Geneseo, Geneseo, N. Y. 14454.
[illegible] PRESERVICE ELEMENTARY
TEACHERS IN THE PROCESSES OF SCIENCE


ABSTRACT


[illegible] between differential science process treatment and the
ability of preservice elementary [illegible] of specific science
process skills which were identified for purposes of this study.
Seventy-five preservice elemen[illegible]ned to three groups. Two
groups were randomly classified as treatment groups, whi[illegible]ted
by post-test only at the end of the 16-week experimental period. The
group receiving the integrated proce[illegible] scores on the
post-test than did the control group. There was no signi[illegible]
and the control group relative to the application of the science
process skills.


THE SCIENCE CURRICULUM developments originating
during the 1950s have produced changes in the science
[illegible] of prospective elementary school teachers. These
modifications have most commonly been concerned with
the process attribute of science.

For the past several decades, science educators have been
aware of the necessity of providing preservice elementary
teachers with experiences in utilization of the processes of
science. These processes refer to the particular operations
which one employs during the performance of science as a
human enterprise. According to Gagne, traditional science
courses have been deficient in accomplishing objectives
related to the process component of science (4). Science
possesses a dual nature in that it is comprised of both product
and process. Product is defined as the derivative of the
scientific enterprise. The process component of science refers
to the activity or the method by which the knowledge
is derived. It is common knowledge that science courses at
all levels have traditionally emphasized science product
objectives while simultaneously excluding science process
objectives.

The National Association of State Directors of Teacher
Education and Certification (NASDTEC), in cooperation
with the American Association for the Advancement of
Science (AAAS), has recommended that preservice elementary
teachers become involved in experiences with the processes
of science. Moreover, NASDTEC-AAAS has suggested
that college instructors examine various instructional procedures
which could enhance the development of preservice
teacher competency in utilizing the science process skill
operations (8).


n the utiliza- 
of scientific 


There is probably considerable agreement that inquiry
includes the process skill operations. Forms of inquiry,
however, may be somewhat variable as regards the manner
of employment of these skills.

Science educators have vigorously supported the exposure
of preservice elementary teachers to the science process
skills. It is not unexpe
[illegible]sure could in[illegible] however, stresses [illegible]
science than did those [illegible] subject matter. Menzel (7)
[illegible] elementary teachers exposed [illegible]
significantly higher achievement [illegible] science processes
of measurement and classification than those subjects exposed
to traditional instructional techniques.

The question of teacher competence in the processes of
science is in need of further study, and the results of
courses and experiences need to be ascertained. Blosser and
Howe (2) state that science educators h[illegible]
research efforts more to the training of secondary
teachers rather than elementary school teachers.

Therefore, this study was undertaken to investigate the
relationship between differential science process treatment
and the competence of preservice teachers in the performance
and application of science process operations. Differential
science process treatment relative to this study refers
to divergent methods of involving preservice elementary
teachers with the science process operations.

The null hypotheses proposed for investigation were 
stated as follows: 


1. There is no difference in the performance of specific science process tasks by preservice elementary teachers as a result of differential science process treatment measured by the Process Instrument for Teachers of Science (10).

2. There is no difference in the application of science process operations by preservice elementary teachers as a result of differential science process treatment measured by the Measurement of the Application of Scientific Methodology (10).


The significance level for testing the null hypotheses was 
established at 0.05. 

For purposes of this study, eleven process skill operations were employed. These skills have been identified by the AAAS as applicable for the science education of preservice elementary school teachers. The eleven skills are: observing; inferring; measuring; predicting; communicating; classifying; defining operationally; formulating hypotheses; analysis of data; interpretation of data; and controlling variables. Several of these skills have been arbitrarily selected and defined as follows:


1. Inferring: The process skill which involves the formulation of immediate explanations or conclusions based on prior observations.

2. Communicating: The process skill involved in the conveyance of an idea by using spoken and/or written words, diagrams, graphs, and other visual aids.

Classifying: The process skill which involves the ...ction of objects or events.

5. Controlling variables: The process employed in investigating situations where a number of conditions require uniformity.

The differential process treatment identified previously ... methods of science process exposure.

1. Small Increment Process Exposure (SIPE): An experimental group exposed to science process treatment utilizing one of the eleven process skills ... A specific or discrete skill is the immediate objective of the activity.

2. Integrated Process Exposure (IPE): An experimental group exposed to science process treatment in which several process skills are employed collectively in the ... phenomena. The focus was conce...


...ors registering for the required elementary science methods course at West Chester State College were placed in eight sections. Three of the eight sections were selected for the experiment. Students were randomly assigned to all eight sections by the experimenter. Class lists were maintained for all sections, and these lists were carefully scrutinized ... of the experiment so that only students on the class lists were allowed ... experimental sections.

The particular type of treatment was arbitrarily determined as each section arrived for the first class meeting. The first group was designated SIPE, the second group IPE, and the third group became the control. No one was informed of the fact that this was an experimental situation.


Experimental Controls

All groups were exposed to the same instructional pro... The product vehicle was as representative as possible, with the manipulation of the treatment variable occurring through activity selection.

The composition of all treatment groups was determined ... According to the data obtained as a result of a student survey and analyzed by employing a χ² test (Table 1), the existence of group homogeneity, from the standpoint of background, can be assumed.

The hours of ten to twelve o'clock were selected for class meetings for all groups. This scheduling was purposely established because of possible extraneous effects should some classes meet in the morning and others in the afternoon. Furthermore, teacher fatigue could have produced another effect should more than one section have met on any one day. All classes for all groups were conducted throughout the experimental period by the same professor, and all classes were held in the same room.

... were the same for all groups. Students were expected to be in attendance since learning was laboratory oriented and required direct participation. At no time during the experimental period were intermediate process tests administered to any of the three groups.

The science product as it was defined previously was similar for all treatment groups. The only possible difference was that the product served as the process vehicle for the SIPE and IPE groups, but it was the main objective of the activities experienced by the control group.

During the first class meeting, a survey was administered to determine group equivalence in terms of sex, age, college experience, and other characteristics. Responses were analyzed by use of the χ² test (Table 1). Each category identified in Table 1 was subdivided into various classes, which are illustrated as follows:


1. Sex (M and F) (R = 2) (C = 3) df = 2
2. Age (less than 20; between 20-22; between 23-25; more than 25) (R = 4) (C = 3) df = 6
3. Undergraduate Rank (sophomore; junior; senior) (R = 3) (C = 3) df = 4
4. Prior College Experience (all prior education at West Chester State College; transfer from another four-year college; transfer from a junior college) (R = 3) (C = 3) df = 4
5. Science Courses Completed (one-two courses; three courses; four or more courses) (R = 3) (C = 3) df = 4
6. Teaching Interest (K-3; 4-6; 1-6) (R = 3) (C = 3) df = 4
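
The degrees of freedom listed for each category follow the usual contingency-table rule, df = (R - 1)(C - 1), with C = 3 experimental groups. The short Python sketch below reproduces the listed values; the dictionary and function names are illustrative only and are not part of the study.

```python
# Rows (R) = number of response classes per survey category, as listed above.
# Columns (C) = 3 experimental groups (SIPE, IPE, GE).
ROWS_PER_CATEGORY = {
    "Sex": 2,
    "Age": 4,
    "Undergraduate rank": 3,
    "Prior college experience": 3,
    "Science courses completed": 3,
    "Teaching interest": 3,
}
N_GROUPS = 3


def contingency_df(rows: int, cols: int) -> int:
    """Degrees of freedom for an R x C chi-square test of independence."""
    return (rows - 1) * (cols - 1)


DF_BY_CATEGORY = {
    name: contingency_df(rows, N_GROUPS)
    for name, rows in ROWS_PER_CATEGORY.items()
}

for name, df in DF_BY_CATEGORY.items():
    print(f"{name}: df = {df}")
```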


Table 1. Chi-Square Values Calculated from Responses to Student Survey by Three Experimental Groups

Category                      df    χ²
Sex                           2     [illegible]
Age                           6     [illegible]
Undergraduate rank            4     7.85
Prior college experience      4     [illegible]
Science courses completed     4     0.61
Teaching interest             4     2.74


The calculated χ² values presented in Table 1 were not significant at the 0.01 significance level. Thus, it was concluded that the three groups were equivalent in terms of responses to the categories indicated.
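
As a rough check on this conclusion, the χ² values that remain legible in Table 1 can be compared with standard tabled critical values. The sketch below is illustrative only: the critical values come from a standard chi-square table rather than from the article, and rows whose values are illegible in the source are left out.

```python
# Upper-tail critical values of the chi-square distribution (standard table values).
CHI2_CRITICAL = {
    0.05: {2: 5.991, 4: 9.488, 6: 12.592},
    0.01: {2: 9.210, 4: 13.277, 6: 16.812},
}

# (category, df, chi-square) for the Table 1 rows whose values are legible.
LEGIBLE_ROWS = [
    ("Undergraduate rank", 4, 7.85),
    ("Science courses completed", 4, 0.61),
    ("Teaching interest", 4, 2.74),
]


def is_significant(chi2: float, df: int, alpha: float) -> bool:
    """True when the calculated chi-square reaches the tabled critical value."""
    return chi2 >= CHI2_CRITICAL[alpha][df]


for category, df, chi2 in LEGIBLE_ROWS:
    verdict = "significant" if is_significant(chi2, df, 0.01) else "not significant"
    print(f"{category}: chi2 = {chi2}, df = {df}, {verdict} at 0.01")
```

None of the legible values reaches the critical value at the 0.01 level, and none reaches it at the 0.05 level used elsewhere in the study.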


Procedure

Each group met two periods each week during the 16-week interval. Each weekly class meeting was 75 minutes in length.

All groups were exposed to a total of eight broad science investigations of a type generally found in elementary science methods courses of this variety. For the SIPE group, each investigation was subdivided into small process increments. Ss could not proceed beyond a discrete process operation. Such increments may have dealt exclusively with observing, measuring, or any one of the eleven process skills indicated previously. The various process activities were not organized in sequence, and the science media for the SIPE group could be drawn at any time from several of the eight investigations. The IPE Ss completed the investigations in an orderly as well as sequential manner. These Ss were always aware of the total sequence within each investigation. The IPE group concentrated on a particular broad investigation during any given time. The control group (GE for General Exposure) utilized the same science media except that these Ss focused on the science product. This group was concerned with learning science concepts as opposed to science process operations. At no time was the control group intentionally exposed to science process skills.

The two instruments employed in this study for purposes of data collection were the Process Instrument for Teachers of Science and the Measurement of the Application of Scientific Methodology (10). Both instruments were administered as post-tests only at the end of the 16-week experimental period. No tests of any kind were administered during the experiment.

Instruments


The Process Instrument for Teachers of Science is a modified version of the AAAS instrument, Science Process Measure for Teachers-Form B (1). The latter instrument is concerned with the performance of discrete process tasks, i.e., observing, inferring, and the like. The modified version excluded items dealing with behavioral objectives and science process hierarchy relative to the elementary science curriculum project entitled Science: A Process Approach (10).

during the semester of 
st results were analyzed 
which is a correlation based 
value of the Pearson r (0.78) 
orrelation significant at the 


Methodology was desi 


SITUATION 2

Another group of students ... experiment and collected ... observations taken for ten trials. Below is the list of observations for the ten trials with the length of string being held constant. Each time measurement is for one single period or one complete movement from the starting position, across, and return.

Length    Time/Trial (seconds)
(cm)      1      2      3      4      5      6      7      8      9             10
?         1.94   1.95   1.93   1.92   1.94   1.96   1.95   1.93   [illegible]   1.90

The following comments refer to the data collected by the second group of students (Situation 2). Respond to each comment by indicating Agreement (A), Disagreement (D), or Partial Agreement (PA). Indicate by marking the proper column on your answer sheet.

1. The motion of the pendulum for each of the ten trials is uniform in terms of distance covered per time interval.

2. ...d be accepted because of conditions of uni... can be assumed for all trials.

3. The conditions under which this data was collected cannot be inferred.

4. The results for the ten trials are to be expected ... controls are enforced.

5. The data for the above ten trials most likely are the result of measuring the time for one single period with a stopwatch rather than calculating the average value based on several trials.

6. The data for the above ten trials most likely are the result of taking the average value of several periods for each trial.


When the Measurement of the Application of Scientific Methodology was administered to the random group of preservice teachers during the 1972 pilot project, pre-/post-test results analyzed by the Pearson r indicated a correlation of 0.83, which was judged as significant at the 0.01 level (10).
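
For readers unfamiliar with the statistic, the Pearson r used for this pre-/post-test check can be sketched in a few lines of Python. The pilot-project scores are not reproduced in the article, so the two score lists below are hypothetical and serve only to show the computation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical pre-test and post-test scores (NOT data from the study).
pre_scores = [12, 15, 9, 20, 17, 11]
post_scores = [14, 18, 10, 22, 16, 13]

r = pearson_r(pre_scores, post_scores)
print(f"Pearson r = {r:.2f}")
```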


Results

The scores achieved by all groups on the post-tests were subjected to analysis employing a one-way analysis of variance and the Scheffé test of multiple comparisons. Tables 2 and 3 provide statistical data for all groups relative to post-test scores.

Table 2. Post-Test Results for the Process Instrument for Teachers of Science

The significance level established for testing the null hypotheses was 0.05.

Table 3. Post-Test Results for the Measurement of the Application of Scientific Methodology

The significance level established for testing the null hypotheses was 0.05.

Tables 4 and 5 provide analysis of experimental results based on a one-way analysis of variance. For significance at the 0.05 level, the calculated F value for 2 and 57 degrees of freedom must equal or exceed the tabled value of 3.16. Therefore, the calculated F (4.296) is significant at the 0.05 level, and the null hypothesis of no significant difference between the three groups as measured by the process instrument was rejected.

For significance at the 0.05 level, the calculated F value for 2 and 57 degrees of freedom must equal or exceed 3.16. The calculated F (3.692) is significant at the 0.05 level. Therefore, the null hypothesis of no significant difference between the three groups as measured by the applications instrument was rejected.

Table 4. ANOVA Results for the Process Instrument for Teachers of Science

Source     SS             df    MS             F
Between    271.60         2     [illegible]    4.296*
Within     [illegible]    57    [illegible]
Total      2073.25        59

*Significant at the 0.05 level


Table 5. ANOVA Results for the Measurement of the Application of Scientific Methodology

Source     SS             df    MS             F
Between    [illegible]    2     [illegible]    3.692*
Within     [illegible]    57    [illegible]
Total      [illegible]    59

*Significant at the 0.05 level
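
The F ratio reported for the process instrument (4.296) can be cross-checked from the two sums of squares that remain legible in Table 4, together with the 2 and 57 degrees of freedom quoted in the text. The Python sketch below is only such a check; the within-groups sum of squares and both mean squares are derived here by arithmetic and are not read from the table.

```python
# Values legible in Table 4 and in the accompanying text.
SS_BETWEEN = 271.60
SS_TOTAL = 2073.25
DF_BETWEEN = 2
DF_WITHIN = 57
F_CRITICAL_05 = 3.16  # tabled F for 2 and 57 df at the 0.05 level, as quoted

# Derived quantities (not read from the table).
ss_within = SS_TOTAL - SS_BETWEEN      # within-groups sum of squares
ms_between = SS_BETWEEN / DF_BETWEEN   # mean square between groups
ms_within = ss_within / DF_WITHIN      # mean square within groups
f_ratio = ms_between / ms_within

print(f"SS within  = {ss_within:.2f}")
print(f"MS between = {ms_between:.2f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F          = {f_ratio:.3f}")
print("reject null hypothesis" if f_ratio >= F_CRITICAL_05 else "retain null hypothesis")
```

The derived ratio agrees with the reported F of 4.296 to three decimal places. The same check cannot be repeated for the applications instrument because the sums of squares in Table 5 are not legible.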


The Scheffé test of multiple comparisons (S-method) was employed to determine differences between group means for the post-tests. The Scheffé test can be employed to test the significance of difference between means separately for all pairs of means where multiple groups are employed. The S-method is defined as follows:

\[ \hat{\psi} = c_1 \bar{X}_1 + c_2 \bar{X}_2 + c_3 \bar{X}_3 \]

where:
the constants $c_1$, $c_2$, $c_3$ are positive and negative real numbers that sum to 0, and
$\hat{\psi}$ is the estimate of the contrast between means.

Before the significance of any contrast $\hat{\psi}$ can be judged, the variance of the contrast must be determined. The variance can be estimated by $\hat{\sigma}^2_{\hat{\psi}}$:

\[ \hat{\sigma}^2_{\hat{\psi}} = MS_w \left( \frac{c_1^2}{N_1} + \frac{c_2^2}{N_2} + \frac{c_3^2}{N_3} \right) \]

where:
$\hat{\sigma}^2_{\hat{\psi}}$ is the estimated variance of the contrast $\hat{\psi}$, and
$MS_w$ is the mean square within groups.

The hypothesis that $\psi = 0$ is rejected if the absolute value of the ratio $\hat{\psi}/\hat{\sigma}_{\hat{\psi}}$ exceeds the square root of $(J - 1)$ times the 95th percentile point of the F distribution with $J - 1$ and $N - J$ degrees of freedom. Reject the null hypothesis that $\psi = 0$ if

\[ \frac{|\hat{\psi}|}{\hat{\sigma}_{\hat{\psi}}} > \sqrt{(J - 1)\; {}_{.95}F_{(J-1),(N-J)}} \]

The value of $\sqrt{(J - 1)\; {}_{.95}F_{(2,57)}}$ is $\sqrt{2(3.16)}$ or 2.514. Therefore, the null hypothesis of no significant difference can be rejected when the ratio $\hat{\psi}/\hat{\sigma}_{\hat{\psi}}$ exceeds 2.514.
0 y 

Tables 6 and 7 present the comparisons between group means on the post-tests by analysis employing the S-method.
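
The S-method decision rule can be illustrated with the values reported for the process instrument in Table 6. The Python sketch below is a worked check rather than the author's computation. The mean square within groups is derived from the Table 4 sums of squares, and the equal group size of 20 is an assumption (60 subjects divided evenly among three groups) that is not stated in the surviving text.

```python
import math

J_GROUPS = 3
F_CRITICAL_05 = 3.16                  # tabled .95 F for 2 and 57 df, as quoted in the text
MS_WITHIN = (2073.25 - 271.60) / 57   # derived from the Table 4 sums of squares
ASSUMED_GROUP_SIZE = 20               # assumption: 60 subjects split evenly


def contrast_variance(ms_within, coeffs, sizes):
    """Estimated variance of a contrast: MS_w * sum(c_j^2 / N_j)."""
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to 0")
    return ms_within * sum(c * c / n for c, n in zip(coeffs, sizes))


def scheffe_critical(j_groups, f_critical):
    """Critical ratio: square root of (J - 1) times the tabled F value."""
    return math.sqrt((j_groups - 1) * f_critical)


# Variance of any pairwise contrast (coefficients 1, -1, 0) under the assumed sizes.
pairwise_variance = contrast_variance(
    MS_WITHIN, (1, -1, 0), (ASSUMED_GROUP_SIZE,) * 3
)
critical_ratio = scheffe_critical(J_GROUPS, F_CRITICAL_05)

# Contrast estimates and standard error as reported in Table 6.
REPORTED_SE = 1.777
TABLE_6_CONTRASTS = {"SIPE - IPE": 0.40, "SIPE - GE": 5.05, "IPE - GE": 4.65}

decisions = {
    name: abs(psi) / REPORTED_SE > critical_ratio
    for name, psi in TABLE_6_CONTRASTS.items()
}

print(f"contrast variance = {pairwise_variance:.2f}")
print(f"critical ratio    = {critical_ratio:.3f}")
for name, significant in decisions.items():
    print(f"{name}: {'significant' if significant else 'not significant'}")
```

Under the equal-size assumption the derived contrast variance rounds to the 3.16 shown in Table 6, which suggests the assumption is at least consistent with the reported figures.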

The post-test results for the process instrument indi- 
cated significant differences for the three experimental 
groups (Table 6). There was no significant difference 
between the means of the SIPE and IPE groups. The 
achievement by the SIPE and IPE groups, however, differed 
significantly from the achievement by the GE control 
group. 

The mean scores for the SIPE and IPE groups were not 
significantly different. This outcome should not be unex- 
pected in view of the fact that both groups experienced 
direct science process treatment using science process skills. 
The difference involved the method of exposure. 

The analysis of group means on the applications instru- 
ment (Table 7) indicated no significant difference between 
the SIPE and GE groups. A significant difference existed 
between the means of the IPE and GE groups. 

Although the mean score achieved by the SIPE group on the Measurement of the Application of Scientific Methodology was higher than that achieved by the GE group, the results were not statistically different at the 0.05 significance level. This outcome could possibly indicate that the exposure to singular, discrete process skill activities does not have any significant effects in terms of the transfer of these skills relative to functioning within a broader inquiry framework. Apparently, the SIPE group could not utilize the process skills in inquiry situations in a manner superior to the GE group despite the fact that the SIPE group had purposely been exposed to discrete process skill experiences and the control group had received no such exposure.

The mean score achieved by the IPE group on the Measurement of the Application of Scientific Methodology was statistically different from the mean achieved by the control group. This result could indicate that the achievement of the IPE group was superior to that of the GE group because of the integrated process skill treatment where the process skills were emphasized in an interrelated fashion.

Discussion

The development of competency in science processes when examined within the framework of performance and application, as measured by the Process Instrument for Teachers of Science and the Measurement of the Application of Scientific Methodology, does appear to be influenced by the type of process exposure in science for preservice elementary school teachers. The experimental group exposed to integrated process skill activities appeared to demonstrate superiority over the other groups in the application of these skills to broader inquiry situations. These results tend to verify the concerns voiced by Jacobson (6) in that a science program for preservice elementary teachers which emphasizes the discrete process skills may not be desirable because the student may not understand the interrelationships of process operations in science.

In truly scientific inquiry, process skills are utilized as tools and integrated within a broad context, enabling the experimenter to formulate decisions, generalizations, and explanations of scientific phenomena. The application of a single skill or operation does not follow automatically because one has been exposed to that skill. The skill or operation must be experienced within a somewhat realistic domain which might suggest and even facilitate the transfer of that skill to broader-based inquiry situations.

Table 6. Scheffé Test Analysis for the Process Instrument

Contrast               ψ̂       σ̂²ψ     ψ̂/σ̂ψ         F
X̄(SIPE) - X̄(IPE)      0.40    3.16    0.40/1.777    0.225
X̄(SIPE) - X̄(GE)       5.05    3.16    5.05/1.777    2.84*
X̄(IPE) - X̄(GE)        4.65    3.16    4.65/1.777    2.617*

*Significant at the 0.05 level

Table 7. Scheffé Test Analysis for the Applications Instrument

Contrast               ψ̂       ψ̂/σ̂ψ         F
X̄(SIPE) - X̄(IPE)      -1.45   -1.45/1.22    -1.189
X̄(SIPE) - X̄(GE)       1.85    1.85/1.22     1.516
X̄(IPE) - X̄(GE)        3.30    3.30/1.22     2.705*

*Significant at the 0.05 level

Implications

The process of scientific inquiry, which may be described as mental skills and habits essential to rational thinking, has historically been given high priority as an objective of elementary education. Unfortunately, this objective has been most commonly a subject of discourse rather than practice.

In order for the development of rational thinking to become an educational reality, inquiry practices in teacher education must be subjected to intensive analyses. The following recommendations are suggested for future study:

1. Competency levels for preservice elementary teachers as regards inquiry skills should be identified.

2. Identification of various instructional procedures for inquiry training is in order. Included in this recommendation are content variations in science.

3. Research dealing with various methods of teaching inquiry should be undertaken.

Glass, G.V. an...
7. Menzel, E.W., "A Stu...
8. National A...
10. Wid...